
Re: [Xen-devel] [PATCH RFC 08/20] libxl/migration: add precopy tuning parameters



On Thu, Mar 30, 2017 at 02:03:29AM -0400, Joshua Otto wrote:
> On Wed, Mar 29, 2017 at 10:08:02PM +0100, Andrew Cooper wrote:
> > On 27/03/17 10:06, Joshua Otto wrote:
> > > In the context of the live migration algorithm, the precopy iteration
> > > count refers to the number of page-copying iterations performed prior to
> > > the suspension of the guest and transmission of the final set of dirty
> > > pages.  Similarly, the precopy dirty threshold refers to the dirty page
> > > count below which we judge it more profitable to proceed to
> > > stop-and-copy rather than continue with the precopy.  These would be
> > > helpful tuning parameters to work with when migrating particularly busy
> > > guests, as they enable an administrator to reap the available benefits
> > > of the precopy algorithm (the transmission of guest pages _not_ in the
> > > writable working set can be completed without guest downtime) while
> > > reducing the total amount of time required for the migration (as
> > > iterations of the precopy loop that will certainly be redundant can be
> > > skipped in favour of an earlier suspension).
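
For illustration, the decision these two parameters control boils down to
something like the sketch below.  The names here are invented and this is not
the actual libxc policy hook, just the shape of the check made after each
precopy pass:

struct precopy_stats {
    unsigned int iteration;     /* precopy passes completed so far    */
    unsigned long dirty_count;  /* pages dirtied during the last pass */
};

/* Return nonzero if another precopy pass is judged worthwhile. */
static int should_continue_precopy(const struct precopy_stats *stats,
                                   unsigned int precopy_iterations,
                                   unsigned long precopy_dirty_threshold)
{
    /* Stop once the remaining dirty set is small enough that the final
     * stop-and-copy will be cheap... */
    if (stats->dirty_count < precopy_dirty_threshold)
        return 0;

    /* ... or once the configured number of passes has been used up. */
    if (stats->iteration >= precopy_iterations)
        return 0;

    return 1;
}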
> > >
> > > To expose these tuning parameters to users:
> > > - introduce a new libxl API function, libxl_domain_live_migrate(),
> > >   taking the same parameters as libxl_domain_suspend() _and_
> > >   precopy_iterations and precopy_dirty_threshold parameters, and
> > >   consider these parameters in the precopy policy
> > >
> > >   (though a pair of new parameters on their own might not warrant an
> > >   entirely new API function, it is added in anticipation of a number of
> > >   additional migration-only parameters that would be cumbersome on the
> > >   whole to tack on to the existing suspend API)
> > >
> > > - switch xl migrate to the new libxl_domain_live_migrate() and add new
> > >   --precopy-iterations and --precopy-threshold parameters to pass
> > >   through
> > >
> > > Signed-off-by: Joshua Otto <jtotto@xxxxxxxxxxxx>
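
For concreteness, the shape of the proposed call might be roughly as below.
This prototype is a guess reconstructed from the description above rather than
the patch itself, so treat the argument order and types as illustrative only:

/* Hypothetical sketch - not the actual prototype from the patch. */
int libxl_domain_live_migrate(libxl_ctx *ctx, uint32_t domid,
                              int fd, int flags,
                              unsigned int precopy_iterations,
                              unsigned int precopy_dirty_threshold,
                              const libxl_asyncop_how *ao_how);

with an xl invocation along the lines of:

    xl migrate --precopy-iterations 5 --precopy-threshold 50 guest desthost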
> > 
> > This will have to defer to the tools maintainers, but I purposefully
> > didn't expose these knobs to users when rewriting live migration,
> > because they cannot be meaningfully chosen by anyone outside of a
> > testing scenario.  (That is not to say they aren't useful for testing
> > purposes, but I didn't upstream my version of this patch.)
> 
> Ahhh, I wondered why those parameters to xc_domain_save() were present
> but ignored.  That's reasonable.
> 
> I guess the way I had imagined an administrator using them would be in a
> non-production/test environment - if they could run workloads
> representative of their production application in this environment, they
> could experiment with different --precopy-iterations and
> --precopy-threshold values (having just a high-level understanding of
> what they control) and choose the ones that result in the best outcome
> for later use in production.
> 

Running in a test environment isn't always an option -- think about
public cloud providers who don't have control over the VMs or the
workload.

> > I spent quite a while wondering how best to expose these tunables in a
> > way that end users could sensibly use them, and the best I came up with
> > was this:
> > 
> > First, run the guest under logdirty for a period of time to establish
> > the working set, and how steady it is.  From this, you have a baseline
> > for the target threshold, and a plausible way of estimating the
> > downtime.  (Better yet, as XenCenter, XenServer's Windows GUI, has proved
> > time and time again, users love graphs!  Even if they don't necessarily
> > understand them.)
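
As a rough illustration of that measurement step (the sampling itself would
come from the log-dirty machinery the migration code already uses; the numbers
below are made up):

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Given dirty-page counts sampled at fixed intervals while the guest ran
 * under log-dirty, report the approximate working set and the downtime a
 * stop-and-copy of that set would need at the given link bandwidth. */
static void estimate_downtime(const unsigned long *dirty_samples, size_t n,
                              double bandwidth_bytes_per_sec)
{
    unsigned long peak = 0;

    for (size_t i = 0; i < n; i++)
        if (dirty_samples[i] > peak)
            peak = dirty_samples[i];

    printf("working set <= %lu pages, estimated downtime ~%.2fs\n",
           peak, (double)peak * PAGE_SIZE / bandwidth_bytes_per_sec);
}

int main(void)
{
    /* e.g. five one-second samples, sent over a ~1Gbit/s (125MB/s) link */
    unsigned long samples[] = { 12000, 13500, 12800, 13100, 12900 };

    estimate_downtime(samples, sizeof(samples) / sizeof(samples[0]), 125e6);
    return 0;
}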
> > 
> > From this baseline, the condition you need to care about is the rate of
> > convergence.  On a steady VM, you should converge asymptotically to the
> > measured threshold, although with 5 or fewer iterations the asymptotic
> > properties don't appear cleanly.  (Of course, the larger the VM, the more
> > iterations, and the more likely you are to spot this.)
> > 
> > Users will either care about the migration completing successfully, or
> > avoiding interrupting the workload.  The majority case would be both,
> > but every user will have one of these two options which is more
> > important than the other.  As a result, there need to be some options to
> > cover "if $X happens, do I continue or abort".
> > 
> > The case where the VM becomes busier is harder, however.  For the
> > users which care about not interrupting the workload, there will be a
> > point above which they'd prefer to abort the migration rather than
> > continue it.  For the users which want the migration to complete, they'd
> > prefer to pause the VM and take a downtime hit, rather than aborting.
> > 
> > Therefore, you really need two thresholds: the one above which you
> > always abort, and the one at which you would normally choose to pause.  The
> > decision as to what to do depends on where you are between these
> > thresholds when the dirty state converges.  (Of course, if the VM
> > suddenly becomes more idle, it is sensible to continue beyond the lower
> > threshold, as it will reduce the downtime.)  The absolute number of
> > iterations, on the other hand, doesn't actually matter from a user's point
> > of view, so isn't a useful control to have.
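
In other words, once the dirty count has converged the choice would look
something like this sketch (thresholds and names invented here):

enum mig_action { MIG_ABORT, MIG_PAUSE_AND_COPY };

/* converged_dirty: the dirty-page count the precopy has settled at.
 * prefer_completion: nonzero if the user would rather take the downtime
 * hit than abandon the migration. */
static enum mig_action decide(unsigned long converged_dirty,
                              unsigned long pause_threshold,
                              unsigned long abort_threshold,
                              int prefer_completion)
{
    if (converged_dirty > abort_threshold)
        return MIG_ABORT;          /* too costly to suspend for anyone */

    if (converged_dirty > pause_threshold)
        /* Between the thresholds: the user's stated priority decides. */
        return prefer_completion ? MIG_PAUSE_AND_COPY : MIG_ABORT;

    return MIG_PAUSE_AND_COPY;     /* at or below the lower threshold */
}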
> > 
> > Another thing to be careful with is the measure of convergence with
> > respect to guest busyness, and other factors influencing the absolute
> > iteration time, such as congestion of the network between the two
> > hosts.  I haven't yet come up with a sensible way of reconciling this
> > with the above, in a way which can be expressed as a useful set of controls.
> > 

My thought as well.

> > 
> > The plan, following migration v2, was always to come back to this and
> > see about doing something better than the current hard coded parameters,
> > but I am still working on fixing migration in other areas (not having
> > VMs crash when moving, because they observe important differences in the
> > hardware).
> 
> I think a good strategy would be to solicit three parameters from the
> user:
> - the precopy duration they're willing to tolerate
> - the downtime duration they're willing to tolerate
> - the bandwidth of the link between the hosts (we could try and estimate
>   it for them but I'd rather just make them run iperf)
> 
> Then, after applying this patch, alter the policy so that precopy simply
> runs for the duration that the user is willing to wait.  After that,
> using the bandwidth estimate, compute the approximate downtime required
> > to transfer the final set of dirty pages.  If this is less than what the
> user indicated is acceptable, proceed with the stop-and-copy - otherwise
> abort.
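
If I follow, expressed as code that policy would be something like the sketch
below (names invented; the bandwidth figure is the user-supplied estimate):

#define PAGE_SIZE 4096UL

enum next_step { PRECOPY_MORE, SUSPEND_AND_COPY, ABORT_MIGRATION };

static enum next_step duration_policy(double elapsed_precopy_sec,
                                      double max_precopy_sec,
                                      unsigned long remaining_dirty_pages,
                                      double bandwidth_bytes_per_sec,
                                      double max_downtime_sec)
{
    /* Keep iterating until the user's precopy budget is spent. */
    if (elapsed_precopy_sec < max_precopy_sec)
        return PRECOPY_MORE;

    /* Estimate the blackout needed to send the final dirty set. */
    double est_downtime_sec =
        (double)remaining_dirty_pages * PAGE_SIZE / bandwidth_bytes_per_sec;

    return est_downtime_sec <= max_downtime_sec ? SUSPEND_AND_COPY
                                                : ABORT_MIGRATION;
}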
> 
> This still requires the user to figure out for themselves how long their
> workload can really wait, but hopefully they already had some idea
> before deciding to attempt live migration in the first place.
> 

I am not entirely sure what to make of this. I'm not convinced using
durations would cover all cases, but I can't come up with a counterexample
that doesn't sound contrived.

Given this series is already complex enough, I think we should set this
aside for another day.

How hard would it be to _not_ include all the knobs in this series?

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

