[Xen-devel] Re: [patch 13/26] Xen-paravirt_ops: Consistently wrap paravirt ops callsites to make them patchable
Rusty Russell wrote:
> On Mon, 2007-03-19 at 11:38 -0700, Linus Torvalds wrote:
>> On Mon, 19 Mar 2007, Eric W. Biederman wrote:
>> Quite often, the biggest single win of inlining is not so much the
>> code size (although if done right, that will be smaller too), but the
>> fact that inlining DOES NOT CLOBBER AS MANY REGISTERS!

True. You can use all of the call clobbered registers. For VMI, the default clobber was "cc", and you need a way to allow at least that, because saving and restoring flags is too expensive on x86.

> Thanks Linus. *This* was the reason that the current hand-coded calls
> only clobber %eax. It was a compromise between native (no clobbers)
> and others (might need a reg).

I still don't think this was a good trade. The primary motivation for clobbering %eax was that Xen wanted a free register to use for computing the offset into the shared data in the case of SMP preemptible kernels. Xen no longer needs such a register; it can use the PDA offset instead. And it does hurt native performance by unconditionally stealing a register in the four most commonly invoked paravirt-ops code sequences.

> Now, since we decided to allow paravirt_ops operations to be normal C
> (i.e. the patching is optional and done late), we actually push and pop
> %ecx and %edx. This makes the call site 10 bytes long, which is a nice
> size for patching anyway (enough for a movl $0, <addr>, a la lguest's
> cli, or movw $0, %gs:<addr> if we supported SMP).
>
> You can do it in 11 bytes with no clobbers and normal C semantics by
> linking to a direct address instead of calling to an indirect, but then
> you need some gross fixup technology in paravirt_patch:
>
>     if (call_addr == (void*)native_sti) { ... }

I think we should probably try to do it in 12 bytes. Freeing %eax for the inline caller is likely to make up for the 2 extra bytes of space we have to nop.
One thing I always tried to get in VMI was to encapsulate the actual code which went through the business of computing arguments that were not even used in the native case. Unfortunately, that seems impossible in the current design, but I don't think it is an issue because I don't think there is actually a way to express:

    SWITCHABLE_CODE_BLOCK_BEGIN {
        /* arbitrary C code for native */
    } SWITCHABLE_CODE_BLOCK_ALTERNATIVE {
        /* arbitrary C code for something else */
    }

Dave's linker suggestion is probably the best for things like that.

Zach

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel