Xen project Mailing List

RE: [Xen-users] OS kernels ports, VT, Pacifica & performance

To: mogensv@xxxxxxxxxxxxxxxx, "Sylvain Coutant" <sco@xxxxxxxxxx>

From: "Petersson, Mats" <mats.petersson@xxxxxxx>

Date: Tue, 22 Nov 2005 13:10:02 +0100

Cc: xen-users@xxxxxxxxxxxxxxxxxxx

Delivery-date: Tue, 22 Nov 2005 12:10:27 +0000

List-id: Xen user discussion <xen-users.lists.xensource.com>

Thread-index: AcXslEzctuIfUwYCTrCeWs0og+SM7QCwtfug

Thread-topic: [Xen-users] OS kernels ports, VT, Pacifica & performance

I'd like to make a comment here... I do HAVE access to Pacifica
hardware, but I can't really comment on its performance in numbers that
could be compared with numbers others may have from other hardware
[since we haven't released our hardware]. So my comments will be on the
basis that the two implementations are roughly equivalent when it comes
to the penalties in different places...

Comments inline below:

> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
> [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> Mogens Valentin
> Sent: 18 November 2005 23:03
> To: Sylvain Coutant
> Cc: xen-users@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-users] OS kernels ports, VT, Pacifica & performance
>
> Sylvain Coutant wrote:
> > Hi all,
> >
> > First, I'm not sure this post should have been sent to xen-devel or
> > here. Please, any list owner : forward to xen-devel if you read this
> > and feel it should have gone there.
> >
> > I wonder what will be the advantage, in terms of performance, of
> > having optimized kernels for XenU when VT/Pacifica will be there.
> >
> > AFAIK, using "standard" kernels means emulating peripherals (network
> > card and so on) on dom0. Xen "optimized" or "ported" kernels should
> > have a performance advantage. But has this perf increase already
> > been evaluated ?

I don't think anyone has published results for this, since neither
AMD's nor Intel's hardware and software have reached general
availability yet. Xen 3.0 is still a moving target, for example, and it
would be unfair to just grab a snapshot today and publish those results,
when someone might figure out next week that you get 5% or 10% or 20%
better performance by doing something differently. [No, I haven't got a
clue whether these performance improvements are realistic, but it's
likely that there are some performance improvements to be had from
adjusting some things in Xen's hypervisor. This may or may not happen
for Xen 3.0].
But there's certainly an advantage in a para-virtualized kernel,
compared to a hardware "only" virtualized kernel. In both instances,
there's a penalty compared to the non-virtual setup.

Say for instance the code contains a "mov %eax, %cr3", which is a single
instruction. It's not a single cycle, but maybe a dozen or so [including
invalidating the TLBs].

However, compare this to the para-virtualized case, where the move to
CR3 becomes a trap into a GP fault handler that first of all has to
determine what instruction it was, then perform the relevant CR3
operation on behalf of the guest [including Dom0], and return from the
trap. This is certainly in the hundreds of cycles, with a few dozen used
up just in the trap/return-from-trap operations.

In the case of a HW virtualized guest, the move to CR3 will cause an
intercept leading to VMEXIT, which means that the current state of the
guest will be stored and the correct operation determined [which is
somewhat easier, because we now have an exitcode from the
VMRUN/VMLAUNCH/VMRESUME instruction], and the data to store into CR3
will have to be dug out of somewhere [perhaps including parsing the
instruction opcode to determine, for instance, which GP register
contains the value]. This, again, is certainly in the hundreds of
cycles, if not more.

However, the para-virtualized system can do clever things with certain
operations: if you modify the original source code in the right way,
common operations can be changed from a trap into a direct call into
the Xen hypervisor, which reduces the time spent getting into and out
of the hypervisor. In the case of HW virtualized operations, each
individual operation needs to be intercepted.

> >
> > Question behind this : does it worth the work to port some other
> > OSes to Xen architecture ?
>
> Not having access to Pacifica/VT hardware (I'd almost kill for it),
> there are limits to what can be commented.
>
> Think of normal multitasking.
> Handing over the cpu timeslice from one running process to another
> means saving the running process' state and loading another sleeping
> process' state.

Correct, and in this case, very little difference will be seen between
the HW and para-virtualized cases. Both will have to do roughly the same
thing, and the penalty compared to the non-virtualized situation is

>
> With virtualization, the whole OS state and pagetable structure needs
> to be saved and revoked. A bit more timeconsuming..
> Having virtualisation hardware support for this will be a real
> speed-up.
> Having hardware support in a cpu with onchip memorycontroller will
> mean even more speed-up.

I've seen the claim that an on-chip memory controller will help
virtualisation, and I do agree that it will, but only in the same sense
that an on-chip memory controller helps every other type of
memory-intensive operation: it gives the processor a more direct link to
the memory controller, allowing for more direct communication. For
instance, an L2 cache lookup can run in parallel with the initial stages
of a memory read, so that if the L2 lookup is a miss, the memory read
has already progressed some cycles down the path, rather than starting
only once the L2 lookup has finished.

>
> Without that onchip memorycontroller, quite a lot still needs be done
> in software. Yes, the cpu 'hardware' virtualization is not just a mix
> of registers and logic, but also firmware; however, still rather
> faster than what the target systems OS + virtualization mechanism can
> achieve.

This explanation doesn't make sense. With or without an on-chip memory
controller, almost identical work has to be done: intercepting the
operations that the hypervisor needs to know about, such as Control
Register writes, I/O operations, Interrupts, Exceptions, etc, etc.
The hypervisor then has to do "the right thing" about the intercepted
operation, and return to the guest [or to another guest, in the case
where the guest is waiting for some event like the end of an I/O
operation].

Without revealing too much about the Pacifica work, I can certainly say
that VERY MUCH of it is almost identical to the Intel VT code, with the
majority of the differences coming from the fact that Intel has specific
instructions to modify the VMCS, whilst AMD chose to implement a VMCB
that is just a block of memory, accessed through the same type of memory
operations as any other memory. There are other differences, which all
stem from the different choices AMD and Intel made when the two
companies implemented the HW virtualization support. But in all
essential points, the work done is identical and the functionality is
very similar.

The integrated memory controller helps make memory reads/writes travel
from the processor to the memory faster, which helps the
guest/host/hypervisor get the work done faster, but it doesn't at all
simplify the work needed in the hypervisor.

>
> With current virtualization techniques, guest/domU systems are always
> emulated, so having cpu hardware virtualization doesn't really change
> that, AFAICS. It'll mean the abilility to run unpatched OS's, though.

Yup, that's exactly how it works. DomU is using emulated hardware
[unless you manually assign PCI resources through Xen's mechanism of
hiding a device from Dom0 and showing it to a selected DomU]. The real
value of HW virtualization is the ability to run an unmodified OS,
whether that means a shrinkwrapped Linux kernel (or for example an OLD
kernel with no virtualization patches available) or one of the
commercially available OSes that do not have open/available source code
to patch.

>
> It's another ballgame further into the future, when the whole platform
> and PCIe gets increasingly virtualized. Maybe sometime around 2008..
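
The VMCS/VMCB point above can be sketched in a few lines of Python. This
is only an illustration, not Xen or vendor code: the class names, field
names, exit reasons and return strings are all hypothetical. The one
thing it tries to show is that the intercept-handling logic is shared,
and only the way the control structure is accessed differs (plain memory
for the VMCB, accessor instructions such as VMREAD/VMWRITE for the
VMCS).

```python
from dataclasses import dataclass
from enum import Enum, auto


class ExitReason(Enum):
    """Hypothetical subset of the intercept reasons named in the message."""
    CR_WRITE = auto()
    IO = auto()
    INTERRUPT = auto()
    EXCEPTION = auto()


@dataclass
class Vmcb:
    """AMD-style control block: ordinary memory, read and written directly."""
    exit_reason: ExitReason = ExitReason.INTERRUPT
    guest_cr3: int = 0


class Vmcs:
    """Intel-style control structure: fields are reached only through
    accessors, standing in for the dedicated VMREAD/VMWRITE instructions."""

    def __init__(self, exit_reason=ExitReason.INTERRUPT):
        self._fields = {"exit_reason": exit_reason, "guest_cr3": 0}

    def vmread(self, name):
        return self._fields[name]

    def vmwrite(self, name, value):
        self._fields[name] = value


def vmcb_access(vmcb):
    """Return (read, write) callables that touch the VMCB as plain memory."""
    return (lambda name: getattr(vmcb, name),
            lambda name, value: setattr(vmcb, name, value))


def vmcs_access(vmcs):
    """Return (read, write) callables that go through the VMCS accessors."""
    return vmcs.vmread, vmcs.vmwrite


def handle_exit(read, write, operand):
    """Shared hypervisor logic: do "the right thing", then pick who runs next."""
    reason = read("exit_reason")
    if reason is ExitReason.CR_WRITE:
        write("guest_cr3", operand)       # emulate the CR3 write for the guest
        return "resume same guest"
    if reason is ExitReason.IO:
        return "schedule another guest"   # this guest waits for the I/O to end
    return "resume same guest"


if __name__ == "__main__":
    vmcb = Vmcb(exit_reason=ExitReason.CR_WRITE)
    print(handle_exit(*vmcb_access(vmcb), 0x1000), hex(vmcb.guest_cr3))
```

The same handle_exit function serves both structures; swapping
vmcb_access for vmcs_access is the only change needed, which is the
sense in which the work done is nearly identical.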
I'm sure further work will happen in the future, and we'll see some (or
lots) of hardware adding support for running in a virtualized mode - how
about a network card that lets multiple guests write to it, and that has
a different virtual network address for each guest?

--
Mats

>
> --
> Kind regards,
> Mogens Valentin
>
>
> PCIe virtualisation: Imagine cat herding with a firehose
> and firecrackers. That is notably easier than getting all
> the peripheral makers to play along.
> -- fun on theinquirer.net
>
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
>
>

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
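
As a rough illustration of the cycle-count argument made in the message
above (native CR3 write vs. trapped vs. intercepted vs. a modified
para-virtualized guest), here is a small Python sketch. Every number in
it is an assumption chosen only to match the message's orders of
magnitude ("a dozen or so", "hundreds of cycles"); none is a
measurement. The batching scheme, where a modified guest queues several
operations behind one entry into the hypervisor, is likewise an
assumption about what the "clever things" could look like, not something
the message specifies.

```python
# All cycle counts below are illustrative assumptions, not measurements.
NATIVE_CR3_WRITE = 12       # "a dozen or so" cycles on bare hardware
TRAP_AND_EMULATE = 300      # GP fault, decode, emulate, return from trap
VMEXIT_AND_EMULATE = 400    # save guest state, dispatch on exit code, resume
HYPERCALL_ENTRY = 200       # hypothetical cost of one hypervisor entry/exit
PER_BATCHED_OP = 20         # hypothetical work per operation once inside


def trapped_cost(n_ops, per_op_cost):
    """Every privileged operation is trapped or intercepted individually."""
    return n_ops * per_op_cost


def batched_cost(n_ops):
    """A modified guest queues operations and enters the hypervisor once."""
    if n_ops == 0:
        return 0
    return HYPERCALL_ENTRY + n_ops * PER_BATCHED_OP


if __name__ == "__main__":
    n = 10
    print("native      :", trapped_cost(n, NATIVE_CR3_WRITE))
    print("trap/emulate:", trapped_cost(n, TRAP_AND_EMULATE))
    print("VMEXIT      :", trapped_cost(n, VMEXIT_AND_EMULATE))
    print("batched PV  :", batched_cost(n))
```

With these made-up numbers the per-operation paths scale linearly with
the number of operations, while the batched path pays the entry cost
once. Both virtualized paths still cost more than running natively,
which matches the point that there is a penalty in either case.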
