
Re: [Minios-devel] [UNIKRAFT PATCH] lib/uknetdev: Introduce receive buffer allocator callback



Hello Simon,

On 12/15/18 3:22 PM, Simon Kuenzer wrote:
Hey Sharan

On 14.12.18 13:33, Sharan Santhanam wrote:
Hello Simon,

Please find the comment inline.

Thanks & Regards
Sharan

On 12/14/18 11:21 AM, Simon Kuenzer wrote:
Hey Sharan,

thanks a lot for your comments. Let me know what you think of my replies ;-).

Thanks,

Simon

On 13.12.18 17:54, Sharan Santhanam wrote:
Hello Simon,


Please find my comments inline.

Thanks & Regards
Sharan


On 12/13/18 7:44 AM, Simon Kuenzer wrote:
Sorry, this patch is a bit long but I wanted to keep everything functionally working.

On 13.12.18 07:41, Simon Kuenzer wrote:
The idea of having a fill-up parameter on the receive function is
inconvenient because a user never knows how many receive buffers have
to be programmed before calling the receive function.
This commit introduces registering a callback when configuring a
receive queue. This callback has to be provided by the NETDEV API user and is called by the driver whenever it programs new receive buffers to
the receive queue. In order to still provide performance, this
allocation callback function is called for a batch of receive buffers. This new mechanism replaces the fill-up parameter on the receive function.

This commit also adopts virtio-net in order to comply with the API change.

Signed-off-by: Simon Kuenzer <simon.kuenzer@xxxxxxxxx>
---
  lib/uknetdev/include/uk/netdev.h        |  41 +++----------
  lib/uknetdev/include/uk/netdev_core.h   |  26 +++++++-
  lib/uknetdev/netdev.c                   |   1 +
  plat/drivers/include/virtio/virtqueue.h |   9 +++
  plat/drivers/virtio/virtio_net.c        | 104 ++++++++++++++++++++------------
  plat/drivers/virtio/virtio_ring.c       |   9 ++-
  6 files changed, 114 insertions(+), 76 deletions(-)

diff --git a/lib/uknetdev/include/uk/netdev.h b/lib/uknetdev/include/uk/netdev.h
index f0fa769..b5ce96d 100644
--- a/lib/uknetdev/include/uk/netdev.h
+++ b/lib/uknetdev/include/uk/netdev.h
@@ -419,9 +419,7 @@ static inline int uk_netdev_rxq_intr_disable(struct uk_netdev *dev,
  }
  /**
- * Receive one packet and re-program used receive descriptor
- * Please note that before any packet can be received, the receive queue
- * has to be filled up with empty netbufs (see fillup parameter).
+ * Receive one packet and re-program used receive descriptors
   *
   * @param dev
   *   The Unikraft Network Device.
@@ -431,52 +429,29 @@ static inline int uk_netdev_rxq_intr_disable(struct uk_netdev *dev,
   *   to uk_netdev_configure().
   * @param pkt
 *   Reference to netbuf pointer which will be point to the received packet
- *   after the function call. Can be NULL if function is used to program
- *   receive descriptors only.
- * @param fillup
- *   Array of netbufs that should be used to program used descriptors again.
- *   Each of the netbuf should be freshly allocated/initialized and not part
- *   of any chain.
- *   `fillup` can be `NULL` but without re-programming of used descriptors no
- *   new packets can be received at some point.
- * @param fillup_count
- *   Length of `fillup` array. After the function call, `fillup_count` returns
- *   the number of left and unused netbufs on the array. `fillup_count` has to
- *   to 0 if `fillup` is `NULL`.
+ *   after the function call. `pkt` has never to be `NULL`.
   * @return
- *   - (0): No packet available or `pkt` was set to NULL,
- *          check `fillup_count` for used `fillup` netbufs
- *   - (1): `pkt` points to received netbuf,
- *          check `fillup_count` for used `fillup` netbufs
+ *   - (0): No packet available
+ *   - (1): `pkt` points to received netbuf
   *   - (2): `pkt` points to received netbuf but more received packets are
   *          available on the receive queue. When interrupts are used, they are
- *          disabled until 1 is returned on subsequent calls,
- *          check `fillup_count` for used `fillup` netbufs
+ *          disabled until 1 is returned on subsequent calls
   *   - (<0): Error code from driver
   */
  static inline int uk_netdev_rx_one(struct uk_netdev *dev, uint16_t queue_id,
-                   struct uk_netbuf **pkt,
-                   struct uk_netbuf *fillup[],
-                   uint16_t *fillup_count)
+                   struct uk_netbuf **pkt)
I like the idea of having a callback refill the receive buffers, but in the previous API we had a way of communicating to the user the number of buffers refilled. With this API, we fail silently if the buffer refill did not work.


The actual failure happens now within the `uk_netdev_alloc_rxpkts()` callback which I think makes the receive API much easier to use.
I agree the callback provides a cleaner interface for filling up the memory buffers.


This callback is provided by the user, so it is the user who is aware of the failure anyway.
I would prefer if we added a status parameter or a return code for the error. Otherwise the user may have to set a flag in the callback to hand information over from the allocator to the receive function.


Hmm, probably the easiest to use with the least overhead are return codes as bit flags. I would add a flag that signals that a queue underrun happened (fewer descriptors re-programmed for receiving packets than slots available on the queue). Programs interested in it could look at this flag and react accordingly; others would ignore it. What do you think?


I agree with this proposal.
  {
      UK_ASSERT(dev);
      UK_ASSERT(dev->rx_one);
      UK_ASSERT(queue_id < CONFIG_LIBUKNETDEV_MAXNBQUEUES);
      UK_ASSERT(dev->_data->state == UK_NETDEV_RUNNING);
      UK_ASSERT(!PTRISERR(dev->_rx_queue[queue_id]));
-    UK_ASSERT((!fillup && fillup_count) || fillup);
+    UK_ASSERT(pkt);
-    return dev->rx_one(dev, dev->_rx_queue[queue_id], pkt,
-               fillup, fillup_count);
+    return dev->rx_one(dev, dev->_rx_queue[queue_id], pkt);
  }
  /**
- * Shortcut for only filling up a receive queue with empty netbufs
- */
-#define uk_netdev_rx_fillup(dev, queue_id, fillup, fillup_count)    \
-    uk_netdev_rx_one((dev), (queue_id), NULL, (fillup), (fillup_count))
-
-/**
   * Transmit one packet
   *
   * @param dev
diff --git a/lib/uknetdev/include/uk/netdev_core.h b/lib/uknetdev/include/uk/netdev_core.h
index b77c45a..d30886d 100644
--- a/lib/uknetdev/include/uk/netdev_core.h
+++ b/lib/uknetdev/include/uk/netdev_core.h
@@ -176,6 +176,25 @@ typedef void (*uk_netdev_queue_event_t)(struct uk_netdev *dev,
                      uint16_t queue_id, void *argp);
  /**
+ * User callback used by the driver to allocate netbufs
+ * that are used to setup receive descriptors.
+ *
+ * @param argp
+ *   User-provided argument.
+ * @param pkts
+ *   Array for netbuf pointers that the function should allocate.
+ * @param count
+ *   Number of netbufs requested (equal to length of pkts).
+ * @return
+ *   Number of successful allocated netbufs,
+ *   has to be in range [0, count].
+ *   References to allocated packets are placed to pkts[0]...pkts[count -1].
+ */
+typedef uint16_t (*uk_netdev_alloc_rxpkts)(void *argp,
+                       struct uk_netbuf *pkts[],
+                       uint16_t count);
+
Don't we want to return an error to the user in case buffer allocation failed?


It is an interface doing batching. The API user is implementing the function, so it is up to the user to handle the error. I think the driver shouldn't care too much about it, because a likely error is that receive buffer pools may be dimensioned too small to cope with incoming packet rates. It would be even more painful to handle errors in the driver other than just leaving descriptors unprogrammed. Because of this, I think it is best to give the driver what we could successfully allocate. Actually, the driver could guess that something went wrong (e.g., out of memory) if the function returns fewer netbufs than requested with `count`. `0` is also a valid return value (so no netbuf allocated), which results in no descriptor being programmed.

I am not suggesting that the driver handle the error. Our driver is the user of the API, and we must forward an error return from the allocator to the user of the receive function. Maybe on returning '0' we report back to the user that the allocator failed.


I would now report back as soon as the returned count is less than the requested number of allocations.


I agree.

+/**
   * A structure used to configure an Unikraft network device RX queue.
   */
  struct uk_netdev_rxqueue_conf {
@@ -183,6 +202,9 @@ struct uk_netdev_rxqueue_conf {
      void *callback_cookie;            /**< Argument pointer for callback. */
      struct uk_alloc *a;               /**< Allocator for descriptors. */
+
+    uk_netdev_alloc_rxpkts alloc_rxpkts; /**< Allocator for rx netbufs */
+    void *alloc_rxpkts_argp;             /**< Argument for alloc_rxpkts */
  #ifdef CONFIG_LIBUKNETDEV_DISPATCHERTHREADS
      struct uk_sched *s;               /**< Scheduler for dispatcher. */
  #endif
@@ -266,9 +288,7 @@ typedef int (*uk_netdev_rxq_intr_disable_t)(struct uk_netdev *dev,
  /** Driver callback type to retrieve one packet from a RX queue. */
  typedef int (*uk_netdev_rx_one_t)(struct uk_netdev *dev,
                    struct uk_netdev_rx_queue *queue,
-                  struct uk_netbuf **pkt,
-                  struct uk_netbuf *fillup[],
-                  uint16_t *fillup_count);
+                  struct uk_netbuf **pkt);
  /** Driver callback type to submit one packet to a TX queue. */
  typedef int (*uk_netdev_tx_one_t)(struct uk_netdev *dev,
diff --git a/lib/uknetdev/netdev.c b/lib/uknetdev/netdev.c
index 8f7dd6e..3b92f62 100644
--- a/lib/uknetdev/netdev.c
+++ b/lib/uknetdev/netdev.c
@@ -333,6 +333,7 @@ int uk_netdev_rxq_configure(struct uk_netdev *dev, uint16_t queue_id,
      UK_ASSERT(dev->ops->rxq_configure);
      UK_ASSERT(queue_id < CONFIG_LIBUKNETDEV_MAXNBQUEUES);
      UK_ASSERT(rx_conf);
+    UK_ASSERT(rx_conf->alloc_rxpkts);
  #ifdef CONFIG_LIBUKNETDEV_DISPATCHERTHREADS
      UK_ASSERT((rx_conf->callback && rx_conf->s)
            || !rx_conf->callback);
diff --git a/plat/drivers/include/virtio/virtqueue.h b/plat/drivers/include/virtio/virtqueue.h
index 3d72c50..e8bfe29 100644
--- a/plat/drivers/include/virtio/virtqueue.h
+++ b/plat/drivers/include/virtio/virtqueue.h
@@ -184,6 +184,15 @@ struct virtqueue *virtqueue_create(__u16 queue_id, __u16 nr_descs, __u16 align,
                     struct virtio_dev *vdev, struct uk_alloc *a);
  /**
+ * Return the number of available descriptors of a virtqueue
+ * @param vq
+ *    A reference to the virtqueue.
+ * @return __uint16
+ *    Number of available descriptors
+ */
+__u16 virtqueue_avail(struct virtqueue *vq);
+
+/**
   * Check the virtqueue if full.
   * @param vq
   *    A reference to the virtqueue.
diff --git a/plat/drivers/virtio/virtio_net.c b/plat/drivers/virtio/virtio_net.c
index 75a6cd6..5c7162e 100644
--- a/plat/drivers/virtio/virtio_net.c
+++ b/plat/drivers/virtio/virtio_net.c
@@ -128,6 +128,9 @@ struct uk_netdev_rx_queue {
      uint16_t nb_desc;
      /* The flag to interrupt on the transmit queue */
      uint8_t intr_enabled;
+    /* User-provided receive buffer allocation function */
+    uk_netdev_alloc_rxpkts alloc_rxpkts;
+    void *alloc_rxpkts_argp;
      /* Reference to the uk_netdev */
      struct uk_netdev *ndev;
      /* The scatter list and its associated fragements */
@@ -197,9 +200,7 @@ static int virtio_netdev_xmit(struct uk_netdev *dev,
                    struct uk_netbuf *pkt);
  static int virtio_netdev_recv(struct uk_netdev *dev,
                    struct uk_netdev_rx_queue *queue,
-                  struct uk_netbuf **pkt,
-                  struct uk_netbuf *fillup[],
-                  uint16_t *fillup_count);
+                  struct uk_netbuf **pkt);
  static const struct uk_hwaddr *virtio_net_mac_get(struct uk_netdev *n);
  static __u16 virtio_net_mtu_get(struct uk_netdev *n);
  static unsigned virtio_net_promisc_get(struct uk_netdev *n);
@@ -213,7 +214,7 @@ static int virtio_netdev_rxq_enqueue(struct uk_netdev_rx_queue *rxq,
                       struct uk_netbuf *netbuf);
  static int virtio_netdev_recv_done(struct virtqueue *vq, void *priv);
  static int virtio_netdev_rx_fillup(struct uk_netdev_rx_queue *rxq,
-        struct uk_netbuf **netbuf, __u16 *count);
+                   int notify);
  /**
   * Static global constants
@@ -262,39 +263,62 @@ static void virtio_netdev_xmit_free(struct uk_netdev_tx_queue *txq)
      uk_pr_debug("Free %"__PRIu16" descriptors\n", cnt);
  }
-static int virtio_netdev_rx_fillup(struct uk_netdev_rx_queue *rxq,
-        struct uk_netbuf **netbuf, __u16 *count)
+#define RX_FILLUP_BATCHLEN 64
+
+static int virtio_netdev_rx_fillup(struct uk_netdev_rx_queue *rxq, int notify)
  {
+    struct uk_netbuf *netbuf[RX_FILLUP_BATCHLEN];
      int rc = 0;
-    __u16 i = 0;
-    __u16 cnt = 0;
+    __u16 i, j;
+    __u16 req;
+    __u16 cnt;
+    __u16 filled = 0;
      /**
     * Fixed amount of memory is allocated to each received buffer. In
     * our case since we don't support jumbo frame or LRO yet we require
       * that the buffer feed to the ring descriptor is atleast
       * ethernet MTU + virtio net header.
+     * Because we using 2 descriptor for a single netbuf, our effective
+     * queue size is just the half.
       */
-    for (i = 0; i < *count; i++) {
-        rc = virtio_netdev_rxq_enqueue(rxq, netbuf[i]);
-        if (rc == -ENOSPC) {
-            uk_pr_debug(
-                "No more place available to add descriptors\n");
-            rc = 0;
-            break;
-        } else if (unlikely(rc < 0)) {
-            uk_pr_err("Failed to add a buffer to the virtqueue: %d\n",
-                  rc);
-            break;
+    while ((req = (virtqueue_avail(rxq->vq) / 2))) {
+        req = MIN(req, RX_FILLUP_BATCHLEN);
+        cnt = rxq->alloc_rxpkts(rxq->alloc_rxpkts_argp, netbuf, req);
+        for (i = 0; i < cnt; i++) {
+            uk_pr_debug("Enqueue netbuf %"PRIu16"/%"PRIu16" (%p) to virtqueue %p...\n",
+                    i + 1, cnt, netbuf[i], rxq);
+            rc = virtio_netdev_rxq_enqueue(rxq, netbuf[i]);
While enqueuing into the virtio ring, the API returns the number of descriptors available. The moment rc == 0, the queue is full.

As a suggestion, we could also change "virtqueue_buffer_dequeue" to return the number of descriptors available, thereby eliminating the need to fetch the queue availability separately:
int virtqueue_buffer_dequeue(struct virtqueue *vq, void **buf,
                   __u32 *len);

I agree, sounds right. I will look into this.

+            if (unlikely(rc < 0)) {

Do we want to report ring full as an error?

That is now a case that shouldn't happen anymore, because this function queries the number of available slots before asking for allocations. It will fit exactly or be less (in case we are under memory pressure). If not, something really bad happened in the driver. However, the general-purpose error handling also catches this kind of failure. I don't think we need separate error handling for this anymore.

yeah seems reasonable

+                uk_pr_err("Failed to add a buffer to receive virtqueue %p: %d\n",
+                      rxq, rc);
+
+                /*
+                 * Release netbufs that we are not going
+                 * to use anymore
+                 */
+                for (j = i; j < cnt; j++)

Should we free this memory or give it back to the user? Wouldn't freeing it up make it expensive?

Because we have an unlikely, terrible situation here, I think the performance of the error clean-up is less important. The problem I see with your suggestion is that we would need another callback to give the buffers back to the user, and that would bloat the API. However, the user is actually able to hook into the free operation already: the destructor callback of netbufs can be used for this.


agreed
+                    uk_netbuf_free(netbuf[j]);
+                goto out;
+            }
+            filled++;
+        }
+
+        if (unlikely(cnt < req)) {
+            uk_pr_debug("Failed to complete fill-up of receive virtqueue %p: Out of memory",
+                    rxq);
+            goto out;
          }
-        cnt++;
      }
-    *count = *count - cnt;
+
+out:
+    uk_pr_debug("Programmed %"PRIu16" receive netbufs to receive virtqueue %p\n",
+            filled, rxq);
      /**
       * Notify the host, when we submit new descriptor(s).
       */
-    if (cnt)
+    if (notify && filled)
          virtqueue_host_notify(rxq->vq);
      return rc;
@@ -498,34 +522,29 @@ static int virtio_netdev_rxq_dequeue(struct uk_netdev_rx_queue *rxq,
  static int virtio_netdev_recv(struct uk_netdev *dev,
                    struct uk_netdev_rx_queue *queue,
-                  struct uk_netbuf **pkt,
-                  struct uk_netbuf *fillup[],
-                  uint16_t *fillup_count)
+                  struct uk_netbuf **pkt)
  {
      int rc = 0;
      int cnt = 0;
      UK_ASSERT(dev && queue);
-    UK_ASSERT(!fillup || (fillup && *fillup_count > 0));
+    UK_ASSERT(pkt);
-    if (pkt && (queue->intr_enabled & VTNET_INTR_USR_EN_MASK)) {
+    if (queue->intr_enabled & VTNET_INTR_USR_EN_MASK) {
          virtqueue_intr_disable(queue->vq);
          queue->intr_enabled &= ~(VTNET_INTR_EN);
      }
-    if (pkt) {
-        rc = virtio_netdev_rxq_dequeue(queue, pkt);
-        if (unlikely(rc < 0)) {
-            uk_pr_err("Failed to dequeue the packet: %d\n", rc);
-            goto err_exit;
-        }
-        cnt = rc;
+    rc = virtio_netdev_rxq_dequeue(queue, pkt);
+    if (unlikely(rc < 0)) {
+        uk_pr_err("Failed to dequeue the packet: %d\n", rc);
+        goto err_exit;
      }
-    if (fillup)
-        virtio_netdev_rx_fillup(queue, fillup, fillup_count);
+    cnt = rc;
+    virtio_netdev_rx_fillup(queue, 1);
      /* Enable interrupt only when user had previously enabled it */
-    if (pkt && (queue->intr_enabled & VTNET_INTR_USR_EN_MASK)) {
+    if (queue->intr_enabled & VTNET_INTR_USR_EN_MASK) {
          /* Need to enable the interrupt on the last packet */
          rc = virtqueue_intr_enable(queue->vq);
          if (rc == 1 && cnt == 0) {
@@ -542,11 +561,13 @@ static int virtio_netdev_recv(struct uk_netdev *dev,
              /* Need to enable the interrupt on the last packet */
              rc = virtqueue_intr_enable(queue->vq);
              cnt = (rc == 1) ? 2 : 1;
+            /* Since we received something, we need to fillup */
+            virtio_netdev_rx_fillup(queue, 1);
          } else if (cnt > 0) {
              /* When there is packet in the buffer */
              cnt = (rc == 1) ? 2 : 1;
          }
-    } else if (pkt && cnt > 0) {
+    } else if (cnt > 0) {
          /**
           * For polling case, we report always there are further
           * packets unless the queue is empty.
@@ -573,6 +594,8 @@ static struct uk_netdev_rx_queue *virtio_netdev_rx_queue_setup(
      UK_ASSERT(n);
      UK_ASSERT(conf);
+    UK_ASSERT(conf->alloc_rxpkts);
+
      vndev = to_virtionetdev(n);
      if (queue_id >= vndev->max_vqueue_pairs) {
          uk_pr_err("Invalid virtqueue identifier: %"__PRIu16"\n",
@@ -589,6 +612,11 @@ static struct uk_netdev_rx_queue *virtio_netdev_rx_queue_setup(
          goto err_exit;
      }
      rxq  = &vndev->rxqs[rc];
+    rxq->alloc_rxpkts = conf->alloc_rxpkts;
+    rxq->alloc_rxpkts_argp = conf->alloc_rxpkts_argp;
+
+    /* Allocate receive buffers for this queue */
+    virtio_netdev_rx_fillup(rxq, 0);
  exit:
      return rxq;
diff --git a/plat/drivers/virtio/virtio_ring.c b/plat/drivers/virtio/virtio_ring.c
index 02d568a..5eaa7e7 100644
--- a/plat/drivers/virtio/virtio_ring.c
+++ b/plat/drivers/virtio/virtio_ring.c
@@ -416,12 +416,17 @@ void virtqueue_destroy(struct virtqueue *vq, struct uk_alloc *a)
      uk_free(a, vrq);
  }
-int virtqueue_is_full(struct virtqueue *vq)
+__u16 virtqueue_avail(struct virtqueue *vq)
  {
      struct virtqueue_vring *vrq;
      UK_ASSERT(vq);
      vrq = to_virtqueue_vring(vq);
-    return (vrq->desc_avail == 0);
+    return vrq->desc_avail;
+}
+
+int virtqueue_is_full(struct virtqueue *vq)
+{
+    return (virtqueue_avail(vq) == 0);
  }




Thanks & Regards
Sharan

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel

 

