Re: [Xen-devel] [PATCH] enable port accesses with (almost) full register context
>IMO you're doing code building anyway, but just of one instruction. You get
>rid of the locking by doing it to a per-CPU buffer, and the stack is the
>obvious place, calling out to register save/restore code. I don't really
>care about the performance of the save/restore code -- it's obviously going
>to be trivial compared with the unavoidable trap-and-emulate cost. Also, do
>you need separate save/restore code for IN vs. OUT instructions?

Actually, in the code I currently have I do. This is because for OUTs I need
to merge the value being output with the user-specified rAX, under the
assumption that the output value and the register contents are not always
identical (i.e. if particular bits within a port needed to be treated
specially by Xen, which I can easily imagine being required at some point).

>Something like:
>  call save_host_restore_guest
>  <IN or OUT>
>  call save_guest_restore_host
>  ret
>
>Would that be reasonable?

It would, provided the above assumption about the need to modify the output
value never becomes true. Additionally, for 64-bit, I'm concerned about the
potential need for indirect calls here (as well as in the syscall
trampolines): there's nothing keeping a user from making the Xen heap 2GB or
more in size. Indirect calls would slow things down further. Depending on the
nature of the allocations made from the Xen heap, it may also be possible to
simply place an upper limit on the heap size, as the heap is currently
assumed to be adjacent to the Xen image. However, taking memory holes at
rather low addresses into account, a user may even be required to bump the
heap size significantly (what if only a few MB of memory existed below 4GB?),
since, after all, the heap size is the size of the address space consumed,
not the amount of memory used.

>Alternatively, perhaps we could get rid of the distinction and emulate all
>port accesses in this way? I suspect that the cost of state save/restore and
>building the trampoline is dwarfed by the cost of the GPF and even the cost
>of the I/O port access itself (they don't tend to be super fast). Could you
>do a few quick measurements to determine this? If the extra cost is less
>than, say, 10%, I'd be inclined to take the hit to avoid interface changes.

Percentages of full-context relative to simply emulated I/O, without yet
having changed the assembly file approach to the stub building one (as per
the above issues):

PentiumIII (32-bit) with locking     67%
PentiumIII (32-bit) without locking  84%
Pentium4 (64-bit) with locking       86%
Pentium4 (64-bit) without locking    89%

Revised patch (domctl->sysctl, naming) attached.

Jan

Attachment: xen-x86-io-register-context-2.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
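
For illustration only, here is a minimal sketch of the stub-building approach discussed in this message, together with the 64-bit reachability concern. It is not taken from the attached patch. The function names build_io_stub and emit_call, the IO_STUB_LEN constant, and all addresses are hypothetical; only the names save_host_restore_guest and save_guest_restore_host come from the quoted sketch. The code emits "call; IN or OUT; call; ret" into a caller-supplied buffer and refuses to build the stub when a helper lies outside the signed 32-bit displacement range of a direct CALL, which is the case where an indirect call would be needed instead.

```c
#include <stddef.h>
#include <stdint.h>

/* x86 opcodes used by the stub (byte-sized port accesses via DX). */
enum {
    OP_CALL_REL32 = 0xE8, /* call rel32 */
    OP_RET        = 0xC3, /* ret */
    OP_IN_AL_DX   = 0xEC, /* in  %dx,%al */
    OP_OUT_DX_AL  = 0xEE  /* out %al,%dx */
};

/* Two 5-byte calls, one 1-byte I/O instruction, one 1-byte ret. */
#define IO_STUB_LEN 12

/*
 * Emit "call target" at buf[*off]. 'base' is the (hypothetical) address
 * at which buf will execute. Returns -1 if the target is not reachable
 * with a signed 32-bit displacement (roughly +/-2GB).
 */
static int emit_call(uint8_t *buf, size_t *off, uint64_t base,
                     uint64_t target)
{
    uint64_t next = base + *off + 5; /* address after the call */
    int64_t disp = (int64_t)(target - next);

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t d = (uint32_t)(int32_t)disp;
    buf[(*off)++] = OP_CALL_REL32;
    for (int i = 0; i < 4; i++) /* little-endian displacement */
        buf[(*off)++] = (uint8_t)(d >> (8 * i));
    return 0;
}

/*
 * Build the stub into buf (at least IO_STUB_LEN bytes). Returns the
 * number of bytes written, or 0 if a helper is out of direct-call range.
 */
static size_t build_io_stub(uint8_t *buf, uint64_t base,
                            uint64_t save_host_restore_guest,
                            uint64_t save_guest_restore_host,
                            uint8_t io_opcode)
{
    size_t off = 0;

    if (emit_call(buf, &off, base, save_host_restore_guest))
        return 0;
    buf[off++] = io_opcode; /* the single emulated IN or OUT */
    if (emit_call(buf, &off, base, save_guest_restore_host))
        return 0;
    buf[off++] = OP_RET;
    return off;
}
```

A caller would place the buffer in per-CPU storage (for example on the stack, as suggested in the quoted text), which is what removes the need for locking. The sketch does not cover the rAX merging for OUTs described above.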