Xen project Mailing List

Re: [Xen-devel] [PATCH v2] dm_op: Add xendevicemodel_modified_memory_bulk.

To: "Jennifer Herbert" <Jennifer.Herbert@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Thu, 23 Mar 2017 02:35:45 -0600

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, PaulDurrant <paul.durrant@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Thu, 23 Mar 2017 08:35:56 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 22.03.17 at 20:55, <Jennifer.Herbert@xxxxxxxxxx> wrote: > On 22/03/17 10:33, Jan Beulich wrote: >>>>> On 21.03.17 at 14:59, <jennifer.herbert@xxxxxxxxxx> wrote: >>> This version of this patch removes the need for the 'offset' parameter, >>> by instead reducing nr_extents, and working backwards from the end of >>> the array. >>> >>> This patch also removes the need to ever write back the passed array of >>> extents to the guest, but instead putting its partial progress in a >>> 'pfns_processed' parameter, which replaces the old 'offset' parameter. >>> As with the 'offest' parameter, 'pfns_processed' should be set to 0 on >>> calling. >> So how is the need for 'pfns_processed' better than the prior need >> for 'offset'? I continue to not see why you'd need any extra field, as >> you can write back to 'first_pfn' and 'nr' just like the old code did. (But >> note that I'm not outright objecting to this solution, I'd first like to >> understand why it was chosen.) >> > > With v1 of the patch, on continuation, two buffers get written back to > guest, one > to update 'offset', and one to update first_pfn (in the array). Although > I can't think of an example > where this would be a problem, I think that semi-randomly altering items > in the passed > array may not be expected, and if that data was re-used for something, > could cause a bug. > > There is precedence for changing the op buffer, and using > pfns_processed means we never have > to change this array, which I think is much cleaner, intuitive, and > halving the copy backs. > > I considered your suggestion of modifying the handle array, but I think > this would make the code much harder to follow, and still require data > written back to the guest - just in a different place. I thought just > reducing the nr_extents a cleaner solution. This backwards processing was a neat idea actually, since here we definitely don't have any ordering requirement. > I can't see the problem with having an argument for continuation - the > xen_dm_op will remain the same size, if we use the argument space or > not. If its just about how the API looks, I find this highly subjective > - its no secret that continuations happen, and data gets stashed in > these parameters - and so i see no reason why having a dedicated > parameter for this is a problem. > > If we want to hide this continuation information, we could define: > > struct xen_dm_op_modified_memory { > /* IN - number of extents. */ > uint32_t nr_extents; > } > > struct xen_dm_op_modified_memory modified_memory_expanded { > struct xen_dm_op_modified_memory modified_memory; > uint32_t pfns_processed > } > > and include both structures in the xen_dm_op union. > The caller would use xen_dm_op_modified_memory modified_memory, and due > to the memset(&op, 0, sizeof(op));, the value for pfns_processed would > end up as zero. Xen could use expanded structure, and make use of > pfns_processed. I don't think this would be helpful. > Or, we could re-purpose the 'pad' member of struct xen_dm_op, call it > continuation_data, and have this as a general purpose continuation > variable, that the caller wouldn't pay any attention to. > > What do you think? No, that's not really an option here imo, as in the general case we'd need this to be a 64-bit value. It just so happens that you can get away with a 32-bit one because you limit the input value to 32 bits. >>> --- a/xen/arch/x86/hvm/dm.c >>> +++ b/xen/arch/x86/hvm/dm.c >>> @@ -119,56 +119,89 @@ static int set_isa_irq_level(struct domain *d, >>> uint8_t isa_irq, >>> } >>> >>> static int modified_memory(struct domain *d, >>> - struct xen_dm_op_modified_memory *data) >>> + struct xen_dm_op_modified_memory *header, >>> + xen_dm_op_buf_t* buf) >>> { >>> - xen_pfn_t last_pfn = data->first_pfn + data->nr - 1; >>> - unsigned int iter = 0; >>> + /* Process maximum of 256 pfns before checking for continuation */ >>> + const unsigned int cont_check_interval = 0xff; >> Iirc the question was asked on v1 already: 256 (as the comment >> says) or 255 (as the value enforces)? > > aw shucks! I had it as 255, and Paul said "255 or 256?" and I changed > it to 256, without thinking, assuming an off by one error, and that he > was right. But, with a seconds thought, it should be 255, and Paul was > actually referring to a comment later in the code, which was 256. Yet batches of 256 would seem a little more "usual". >>> int rc = 0; >>> if ( copy_from_guest_offset(&extent, buf->h, rem_extents - 1, 1) ) >>> + return -EFAULT; >>> + >>> + pfn = extent.first_pfn + header->pfns_processed; >>> + batch_nr = extent.nr - header->pfns_processed; >> Even if this looks to be harmless to the hypervisor, I dislike the >> overflows you allow for here. > > I'll add a check for (header->pfns_processed > extent.nr), returning > -EINVAL. >= actually, I think. >>> @@ -441,13 +474,8 @@ static int dm_op(domid_t domid, >>> struct xen_dm_op_modified_memory *data = >>> &op.u.modified_memory; >>> >>> - const_op = false; >>> - >>> - rc = -EINVAL; >>> - if ( data->pad ) >>> - break; >> Where did this check go? > > data->pad is undefined here. But I could add a check for extent.pad in > the modified_memory function. That's what I'm asking for. Talking of which - is there a way for the caller to know on which extent an error (if any) has occurred? This certainly needs to be possible. >>> --- a/xen/include/public/hvm/dm_op.h >>> +++ b/xen/include/public/hvm/dm_op.h >>> @@ -237,13 +237,24 @@ struct xen_dm_op_set_pci_link_route { >>> * XEN_DMOP_modified_memory: Notify that a set of pages were modified by >>> * an emulator. >>> * >>> - * NOTE: In the event of a continuation, the @first_pfn is set to the >>> - * value of the pfn of the remaining set of pages and @nr reduced >>> - * to the size of the remaining set. >>> + * DMOP buf 1 contains an array of xen_dm_op_modified_memory_extent with >>> + * @nr_extent entries. The value of @nr_extent will be reduced on >>> + * continuation. >>> + * >>> + * @pfns_processed is used for continuation, and not intended to be usefull >>> + * to the caller. It gives the number of pfns already processed (and can >>> + * be skipped) in the last extent. This should always be set to zero. >>> */ >>> #define XEN_DMOP_modified_memory 11 >>> >>> struct xen_dm_op_modified_memory { >>> + /* IN - number of extents. */ >>> + uint32_t nr_extents; >>> + /* IN - should be set to 0, and is updated on continuation. */ >>> + uint32_t pfns_processed; >> I'd prefer if the field (if to be kept) wasn't named after its current >> purpose, nor should we state anything beyond it needing to be >> zero when first invoking the call. These are implementation details >> which callers should not rely on. > > Assuming we keep it, how about calling it "reserved", with a comment in > dm.c explaining what its actually for? Elsewhere we use "opaque", but "reserved" is probably fine too. However, we may want to name it as an OUT value, for the error-on-extent indication mentioned above (of course another option would be to make nr_extent IN/OUT). As an OUT, we're free to use it for whatever intermediate value, just so long as upon final return to the caller it has the intended value. (It's debatable whether it should be IN/OUT, due to the need for it to be set to zero.) Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.