[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2] xenstore: add support for reading directory with many children



On 25/10/16 15:20, Jan Beulich wrote:
>>>> On 25.10.16 at 13:41, <JGross@xxxxxxxx> wrote:
>> On 25/10/16 11:06, Jan Beulich wrote:
>>>>>> On 25.10.16 at 07:52, <JGross@xxxxxxxx> wrote:
>>>> --- a/tools/xenstore/xenstored_transaction.c
>>>> +++ b/tools/xenstore/xenstored_transaction.c
>>>> @@ -79,6 +79,11 @@ struct transaction
>>>>  
>>>>    /* List of changed domains - to record the changed domain entry number 
>>>> */
>>>>    struct list_head changed_domains;
>>>> +
>>>> +  /* Temporary buffer for XS_DIRECTORY_PART. */
>>>> +  char *dirpart_buf;
>>>> +  unsigned buf_off;
>>>> +  unsigned buf_len;
>>>>  };
>>>
>>> The buffer you allocate for this can - by nature of this change - be
>>> huge, and there's one per transaction. Isn't this causing a resource
>>> utilization issue?
>>
>> It will be allocated only when needed and its size is less than that of
>> the node to be read.
> 
> That's not addressing my concern. A DomU could have multiple
> transactions each with such an operation in progress. And the
> space here doesn't get accounted against any of the quota.

It is accounted against the maximum number of transactions.

Additionally the buffer will be kept only if not all of the data
could be transferred in the first iteration of the read.

> 
>>>> @@ -280,6 +285,58 @@ void conn_delete_all_transactions(struct connection 
>>>> *conn)
>>>>    conn->transaction_started = 0;
>>>>  }
>>>>  
>>>> +void send_directory_part(struct connection *conn, struct buffered_data 
>>>> *in)
>>>> +{
>>>> +  struct transaction *trans = conn->transaction;
>>>> +  struct node *node;
>>>> +  const char *name = onearg(in);
>>>> +  unsigned len;
>>>> +
>>>> +  if (name == NULL || trans == NULL) {
>>>> +          send_error(conn, EINVAL);
>>>> +          return;
>>>> +  }
>>>> +
>>>> +  if (name[0] == '@' && name[1] == 0) {
>>>> +          if (trans->dirpart_buf == NULL) {
>>>> +                  send_error(conn, EINVAL);
>>>> +                  return;
>>>> +          }
>>>> +  } else {
>>>> +          if (trans->dirpart_buf) {
>>>> +                  talloc_free(trans->dirpart_buf);
>>>> +                  trans->dirpart_buf = NULL;
>>>> +          }
>>>> +
>>>> +          name = canonicalize(conn, name);
>>>> +          node = get_node(conn, in, name, XS_PERM_READ);
>>>> +          if (!node) {
>>>> +                  send_error(conn, errno);
>>>> +                  return;
>>>> +          }
>>>> +          trans->dirpart_buf = talloc_array(trans, char,
>>>> +                                            node->childlen + 1);
>>>
>>> Once again, especially considering the buffer possibly being huge,
>>> shouldn't you check for allocation failure here?
>>
>> I followed coding style from xenstored. Modifying this style should be
>> another patch set IMHO (I'd be fine doing this).
> 
> Cloning buggy code is never a good idea imo.
> 
>>>> +          memcpy(trans->dirpart_buf, node->children, node->childlen);
>>>> +          trans->dirpart_buf[node->childlen] = 0;
>>>> +          trans->buf_off = 0;
>>>> +          trans->buf_len = node->childlen + 1;
>>>> +  }
>>>> +
>>>> +  if (trans->buf_len - trans->buf_off > 1024)
>>>
>>> What has this 1024 been derived from? In particular, why is it not
>>> (close to) the 4096 limit mentioned in the description?
>>
>> In theory a single entry could be up to 2048 bytes long. I wanted to
>> keep the logic simple by not trying to iterate until the limit is
>> reached. I can change that, of course.
> 
> While I think that would be worthwhile, I don't think this addresses
> my comment: You don't really explain where the 1024 is coming from.
> Instead to me your response looks to be related to the following
> point.

The 1024 is some arbitrary choice. I'll change it with the iteration
towards the allowed maximum added.

> 
>>>> +          len = strlen(trans->dirpart_buf + trans->buf_off) + 1;
>>>
>>> Looking at construct_node() I'm getting the impression that there
>>> are embedded nul characters in the ->children array, in which case
>>> this strlen() would likely chop off values which could still fit, requiring
>>> needlessly many iterations. And the explicit nul termination after the
>>> allocation above would then also appear to be unnecessary.
>>
>> Each children member is a nul terminated string, so I need to keep
>> the nul bytes in the result. And the final nul byte is the indicator
>> for the last entry being reached.
> 
> Right, but this means you transfer only a single entry here. That was
> the point of my original concern.

And only now I realize that you are right. I meant to transfer at
least 1024 bytes but didn't do so. I'll correct this.

> 
>>> And then - why did you put this function here instead of in
>>> xenstored_core.c, e.g. next to send_directory()?
>>
>> It is using the xenstored_transaction.c private struct transaction.
>> Putting it in xenstored_core.c would have required to move the
>> struct definition into a header file.
> 
> Yeah, I've realized this after having sent the reply, but then again
> this also relates to the first question above (whether this should be
> a per-transaction item in the first place). In particular I don't think
> this being per transaction helps with allowing the guest to (easily)
> run multiple such queries in parallel, as commonly simple operations
> get run with null transactions.

The XS_DIRECTORY_PART command is valid in a transaction only. This is
a prerequisite to be able to split the operation into multiple pieces
while keeping the consistency.


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.