Re: [Xen-devel] [RFC v2] xSplice design
On 10/30/2015 10:39 AM, Martin Pohlack wrote:
> On 29.10.2015 17:55, Ross Lagerwall wrote:
>> On 10/27/2015 12:05 PM, Ross Lagerwall wrote:
>>> On 06/12/2015 12:39 PM, Martin Pohlack wrote:
>>>> On 15.05.2015 21:44, Konrad Rzeszutek Wilk wrote:
>>>>> [...]
>>>>> ## Hypercalls
>>>>>
>>>>> We will employ the sub operations of the system management
>>>>> hypercall (sysctl). There are to be four sub-operations:
>>>>>
>>>>>  * upload the payloads.
>>>>>  * listing of payloads summary uploaded and their state.
>>>>>  * getting a particular payload summary and its state.
>>>>>  * command to apply, delete, or revert the payload.
>>>>>
>>>>> The patching is asynchronous therefore the caller is responsible
>>>>> to verify that it has been applied properly by retrieving the
>>>>> summary of it and verifying that there are no error codes
>>>>> associated with the payload.
>>>>>
>>>>> We **MUST** make it asynchronous due to the nature of patching:
>>>>> it requires every physical CPU to be lock-step with each other.
>>>>> The patching mechanism while an implementation detail, is not
>>>>> a short operation and as such the design **MUST** assume it will
>>>>> be a long-running operation.
>>>>
>>>> I am not convinced yet, that you need an asynchronous approach
>>>> here.
>>>>
>>>> The experience from our prototype suggests that hotpatching itself
>>>> is not an expensive operation. It can usually be completed well
>>>> below 1ms with the most expensive part being getting the hypervisor
>>>> to a quiet state.
>>>
>>> FWIW, my current implementation (which is almost certainly not
>>> optimal) tested on a 72 CPU machine takes about 3ms, whether idle or
>>> fully loaded.
>>
>> Let me correct that: it takes 60 µs to 100 µs to synchronize and apply
>> the patch (on the same hardware) when synchronous console logging is
>> turned off.
>
> The interesting (and very rare) case is if other CPUs are busy in Xen
> already, for example, with memory scrubbing or other long-running
> activities. Those are hard to interrupt and delay patching activity.
> Having multiple guests in a reboot-loop / being restarted all the time
> might help triggering this case.

I have been able to trigger this, which is why I put a (currently
hard-coded) 10ms timeout in the synchronization code: if the CPUs do not
synchronize within that time, it gives up and returns an error. The user
can then optionally retry at a later point.

-- 
Ross Lagerwall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
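
To illustrate the timeout-and-retry behaviour described in the reply above, here is a minimal single-threaded sketch in C. It is not the xSplice code. Every name in it (`sim_cpu`, `try_apply`, `apply_with_retry`, `SYNC_TIMEOUT_NS`) is hypothetical, the clock is simulated, and the returned `-EBUSY` is an assumed error code. Each simulated CPU becomes quiet at a fixed time; one attempt polls until all CPUs are quiet or the 10ms deadline passes, and a caller-side loop retries later.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant mirroring the hard-coded 10ms timeout. */
#define SYNC_TIMEOUT_NS UINT64_C(10000000)

/* A simulated CPU: it can reach the quiet state once the clock
 * has advanced to busy_until_ns (e.g. after memory scrubbing). */
struct sim_cpu {
    uint64_t busy_until_ns;
};

/* Simulated monotonic clock, advanced by the polling loop below. */
static uint64_t sim_now_ns;

/* True when no CPU is still busy at time 'now'. */
static bool all_quiet(const struct sim_cpu *cpus, unsigned int n, uint64_t now)
{
    for (unsigned int i = 0; i < n; i++)
        if (cpus[i].busy_until_ns > now)
            return false;
    return true;
}

/* One attempt: wait up to SYNC_TIMEOUT_NS for every CPU to become quiet.
 * Returns 0 and sets *applied on success, or -EBUSY (an assumed error
 * code) if the deadline passes first. */
static int try_apply(const struct sim_cpu *cpus, unsigned int n, bool *applied)
{
    const uint64_t deadline = sim_now_ns + SYNC_TIMEOUT_NS;

    assert(n > 0);
    while (!all_quiet(cpus, n, sim_now_ns)) {
        if (sim_now_ns >= deadline)
            return -EBUSY;          /* give up; nothing was patched */
        sim_now_ns += 1000;         /* poll once per simulated microsecond */
    }
    *applied = true;                /* stand-in for applying the patch */
    return 0;
}

/* Caller-side retry: wait backoff_ns of simulated time between attempts. */
static int apply_with_retry(const struct sim_cpu *cpus, unsigned int n,
                            unsigned int max_attempts, uint64_t backoff_ns,
                            bool *applied)
{
    int rc = -EBUSY;

    for (unsigned int i = 0; i < max_attempts && rc == -EBUSY; i++) {
        if (i > 0)
            sim_now_ns += backoff_ns;
        rc = try_apply(cpus, n, applied);
    }
    return rc;
}
```

In this model, a CPU that stays busy for 25ms makes the first attempt fail, and a retry after a 5ms pause succeeds. Nothing is modified on a failed attempt, so retrying is safe here; whether that also holds for a real implementation depends on details not covered in this thread.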