Re: [Xen-devel] [RFC v2] xSplice design
On 30.10.2015 15:03, Ross Lagerwall wrote:
> On 10/30/2015 10:39 AM, Martin Pohlack wrote:
>> On 29.10.2015 17:55, Ross Lagerwall wrote:
>>> On 10/27/2015 12:05 PM, Ross Lagerwall wrote:
>>>> On 06/12/2015 12:39 PM, Martin Pohlack wrote:
>>>>> On 15.05.2015 21:44, Konrad Rzeszutek Wilk wrote:
>>>>> [...]
>>>>>> ## Hypercalls
>>>>>>
>>>>>> We will employ the sub operations of the system management
>>>>>> hypercall (sysctl).
>>>>>> There are to be four sub-operations:
>>>>>>
>>>>>>  * upload the payloads.
>>>>>>  * listing of payloads summary uploaded and their state.
>>>>>>  * getting a particular payload summary and its state.
>>>>>>  * command to apply, delete, or revert the payload.
>>>>>>
>>>>>> The patching is asynchronous therefore the caller is responsible
>>>>>> to verify that it has been applied properly by retrieving the
>>>>>> summary of it and verifying that there are no error codes
>>>>>> associated with the payload.
>>>>>>
>>>>>> We **MUST** make it asynchronous due to the nature of patching:
>>>>>> it requires every physical CPU to be lock-step with each other.
>>>>>> The patching mechanism, while an implementation detail, is not a
>>>>>> short operation and as such the design **MUST** assume it will be
>>>>>> a long-running operation.
>>>>>
>>>>> I am not yet convinced that you need an asynchronous approach here.
>>>>>
>>>>> The experience from our prototype suggests that hotpatching itself is
>>>>> not an expensive operation. It can usually be completed well below 1ms
>>>>> with the most expensive part being getting the hypervisor to a quiet
>>>>> state.
>>>>>
>>>>
>>>> FWIW, my current implementation (which is almost certainly not optimal)
>>>> tested on a 72 CPU machine takes about 3ms, whether idle or fully loaded.
>>>>
>>>
>>> Let me correct that: it takes 60 µs to 100 µs to synchronize and apply
>>> the patch (on the same hardware) when synchronous console logging is
>>> turned off.
>>
>> The interesting (and very rare) case is if other CPUs are busy in Xen
>> already, for example, with memory scrubbing or other long-running
>> activities. Those are hard to interrupt and delay patching activity.
>>
>> Having multiple guests in a reboot-loop / being restarted all the time
>> might help trigger this case.
>>
>
> I have been able to trigger this which is why I put in a (currently
> hard-coded) 10ms timeout in the synchronization code otherwise it gives
> up and returns an error. It could then be optionally retried by the user
> at a later point.

If you ever want to run this in QEMU etc. you need to account for the
scheduling timeslice of the host system. I found it necessary to work
with 20 ms for that specific case.

Martin

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Managing Directors: Dr. Ralf Herbrich, Christian Schlaeger
VAT ID: DE289237879
Registered at the Charlottenburg District Court, HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
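
For illustration, the timeout-and-retry behaviour discussed in this thread can be sketched in user space. This is only a Python simulation under stated assumptions, not xSplice or Xen code: the names `rendezvous` and `apply_with_retry` are hypothetical, each "CPU" is an ordinary thread, and a busy CPU (for example one that is scrubbing memory) is modelled as a sleep before it checks in. The timeouts are far larger than the 10 ms mentioned above because Python threads are subject to the host scheduler, which is the same effect described for QEMU.

```python
import threading
import time


def rendezvous(cpu_delays_s, timeout_s):
    """Return True if every simulated CPU checks in within timeout_s.

    cpu_delays_s: per-CPU delay (seconds) before that CPU can check in.
    timeout_s:    how long a waiting CPU stays at the barrier before giving up.
    """
    if not cpu_delays_s:
        return True  # nothing to synchronize
    barrier = threading.Barrier(len(cpu_delays_s))
    arrived = [False] * len(cpu_delays_s)

    def cpu(idx, delay_s):
        # Simulated long-running activity that cannot be interrupted.
        time.sleep(delay_s)
        try:
            barrier.wait(timeout=timeout_s)
            arrived[idx] = True
        except threading.BrokenBarrierError:
            # A waiter timed out (or the barrier was already broken):
            # the whole attempt is abandoned.
            pass

    threads = [
        threading.Thread(target=cpu, args=(i, d))
        for i, d in enumerate(cpu_delays_s)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(arrived)


def apply_with_retry(attempts, timeout_s):
    """Try one rendezvous per entry in attempts; return the 1-based number
    of the first attempt that succeeds, or None if all of them time out."""
    for n, delays in enumerate(attempts, start=1):
        if rendezvous(delays, timeout_s):
            return n
    return None


if __name__ == "__main__":
    # First attempt: one CPU is busy for 1 s, so the 0.2 s timeout expires.
    # Second attempt: all CPUs are idle and the rendezvous completes.
    print(apply_with_retry([[0, 0, 1.0], [0, 0, 0]], timeout_s=0.2))
```

In this sketch a single late CPU fails the whole attempt and the caller simply tries again later, mirroring the "give up, return an error, let the user retry" behaviour described above. The real synchronization code would not use threads or sleeps.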