RE: [Xen-devel] Re: AIO for better disk IO? Re: [Xen-users] Getting better Disk IO
I meant, which PV driver?

-- Mats

> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Liang Yang
> Sent: 18 January 2007 17:01
> To: Petersson, Mats; Mark Williamson; xen-users@xxxxxxxxxxxxxxxxxxx; Xen-Devel
> Cc: Tom Horsley; Goswin von Brederlow; James Rivera
> Subject: [Xen-devel] Re: AIO for better disk IO? Re: [Xen-users] Getting better Disk IO
>
> I'm using 8 Maxtor Atlas SAS II drives and the OS I'm using is Red Hat Enterprise Linux 4U4. JBOD, MD-RAID0 and MD-RAID5 all show the same consistent performance gap.
>
> Liang
>
> ----- Original Message -----
> From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
> To: "Liang Yang" <multisyncfe991@xxxxxxxxxxx>; "Mark Williamson" <mark.williamson@xxxxxxxxxxxx>; <xen-users@xxxxxxxxxxxxxxxxxxx>; "Xen-Devel" <xen-devel@xxxxxxxxxxxxxxxxxxx>
> Cc: "Tom Horsley" <tomhorsley@xxxxxxxxxxxx>; "Goswin von Brederlow" <brederlo@xxxxxxxxxxxxxxxxxxxxxxxxxxx>; "James Rivera" <jrivera@xxxxxxxxxxx>
> Sent: Thursday, January 18, 2007 2:50 AM
> Subject: RE: AIO for better disk IO? Re: [Xen-users] Getting better Disk IO
>
> > -----Original Message-----
> > From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Liang Yang
> > Sent: 17 January 2007 16:44
> > To: Petersson, Mats; Mark Williamson; xen-users@xxxxxxxxxxxxxxxxxxx; Xen-Devel
> > Cc: Tom Horsley; Goswin von Brederlow; James Rivera
> > Subject: Re: AIO for better disk IO? Re: [Xen-users] Getting better Disk IO
> >
> > Hi Mats,
> >
> > Thanks for your reply.
> >
> > You said an HVM domain using the PV driver should have the same disk I/O performance as a PV guest. However, based on my experiments, this is not true. I have tried several I/O benchmark tools (dd, iozone, iometer etc.) and they all show a big gap between an HVM domain with the PV driver and a PV guest domain. This is especially true for large I/O request sizes (64k, 128k and 256k sequential I/O). So far, the disk I/O performance of HVM with the PV driver is only 20~30% of that of PV guests.
>
> What driver are you using, and in what OS?
>
> 20-30% is a lot better than the 10% that I've seen with the QEMU driver, but I would still expect better than 30% from the PV driver... Not that I have actually tested this, as I have other tasks.
>
> > Another thing that puzzles me is the disk I/O performance of PV guests when tested with small request sizes (512B and 1K sequential I/O). Although the PV guest comes very close to native performance for large requests, there is still a clear gap for small requests (512B and 1K). I wonder whether the Xen hypervisor changes the request coalescing behavior for small sizes. Do you know if this is true?
>
> Don't know. I wouldn't think so.
>
> I would think that the reason small requests are more noticeable is that the overhead of the hypervisor becomes much more noticeable than for a large request: the overhead is (almost) constant for the hypercall(s) involved, but the time used in the driver to actually perform the disk I/O depends on the size of the request.
>
> --
> Mats
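The (almost) constant per-request overhead is easy to see with a small test program. Below is a minimal sketch that times sequential O_DIRECT reads at the request sizes discussed above; the device path (/dev/sdb) and the 64 MB total are placeholders and must be changed to something safe to read on your system.

/* Minimal sketch: time sequential O_DIRECT reads at several request sizes
 * to see how per-request overhead dominates small I/O.
 * Compile with: gcc -O2 -o blkbench blkbench.c
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define TOTAL_BYTES (64 * 1024 * 1024)   /* read 64 MB per test (placeholder) */

static double now_sec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/dev/sdb";   /* placeholder */
    size_t sizes[] = { 512, 1024, 65536, 131072, 262144 };
    size_t i;

    for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        size_t bs = sizes[i];
        int fd = open(path, O_RDONLY | O_DIRECT);   /* bypass the page cache */
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, bs)) { perror("posix_memalign"); return 1; }

        double t0 = now_sec();
        size_t done = 0;
        while (done < TOTAL_BYTES) {
            ssize_t n = read(fd, buf, bs);   /* one request per iteration */
            if (n <= 0) { if (n < 0) perror("read"); break; }
            done += (size_t)n;
        }
        double dt = now_sec() - t0;

        printf("bs=%6zu  %.1f MB/s  (%.0f requests/s)\n",
               bs, done / dt / 1e6, (done / (double)bs) / dt);

        free(buf);
        close(fd);
    }
    return 0;
}

Run in dom0, in a PV guest and in an HVM guest: throughput at 256k requests should be close everywhere, while the requests-per-second figure at 512B shows how much fixed per-request cost each path adds.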
> > Liang
> >
> > ----- Original Message -----
> > From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
> > To: "Liang Yang" <multisyncfe991@xxxxxxxxxxx>; "Mark Williamson" <mark.williamson@xxxxxxxxxxxx>; <xen-users@xxxxxxxxxxxxxxxxxxx>
> > Cc: "Tom Horsley" <tomhorsley@xxxxxxxxxxxx>; "Goswin von Brederlow" <brederlo@xxxxxxxxxxxxxxxxxxxxxxxxxxx>; "James Rivera" <jrivera@xxxxxxxxxxx>
> > Sent: Wednesday, January 17, 2007 3:07 AM
> > Subject: RE: AIO for better disk IO? Re: [Xen-users] Getting better Disk IO
> >
> > > -----Original Message-----
> > > From: Liang Yang [mailto:multisyncfe991@xxxxxxxxxxx]
> > > Sent: 16 January 2007 17:53
> > > To: Petersson, Mats; Mark Williamson; xen-users@xxxxxxxxxxxxxxxxxxx
> > > Cc: Tom Horsley; Goswin von Brederlow; James Rivera
> > > Subject: AIO for better disk IO? Re: [Xen-users] Getting better Disk IO
> > >
> > > Hi Mats,
> >
> > Let me first say that I'm not an expert on AIO, but I did sit through the presentation of the new blktap driver at the Xen Summit. The following is to the best of my understanding, and could be "codswallop" for all that I know... ;-)
> >
> > > I once posted my questions about the behavior of asynchronous I/O under Xen, which is also directly related to disk I/O performance; however, I did not get any response. I would appreciate it if you could advise on this.
> > >
> > > As AIO can help improve performance, and the Linux kernel keeps tuning the AIO path, more and more I/Os can be expected to take the AIO path instead of the regular I/O path.
> > >
> > > First Question: If we consider Xen, do we need to do AIO both in domain0 and in the guest domains at the same time? For example, consider two situations: let a fully virtualized guest domain still do regular I/O while domain0 (the VBD back-end driver) does AIO, or let both the fully virtualized guest domain and domain0 do AIO. What is the possible performance difference here?
> >
> > The main benefit of AIO is that the current requestor (such as the VBD back-end driver) can continue doing other things whilst the data is being read/written to/from the actual storage device. This in turn reduces latency where there are multiple requests outstanding from the guest OS (for example multiple guests requesting "simultaneously", or multiple requests issued by the same guest close together).
> >
> > The bandwidth difference all arises from the reduced latency, not because AIO is in itself better performing.
> >
> > > Second Question: Does domain0 always wait until the AIO data is available and then notify the guest domain, or does domain0 issue an interrupt to notify the guest domain as soon as the AIO is queued? If the first case is true, then all AIOs effectively become synchronous.
> >
> > The guest cannot be issued with an interrupt to signify "data available" until the guest's data has been read, so for reads at least, the effect from the guest's perspective is still synchronous. This doesn't mean that the guest can't issue further requests (for example from a different thread, or simply by queuing multiple requests to the device) and gain from the fact that these requests can be started before the first issued request is completed (from the back-end driver's point of view).
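To illustrate what "multiple requests outstanding" buys the back end, here is a small sketch of the Linux AIO interface (libaio) that, as I understand it, the blktap tap:aio: path is built on: one thread submits a batch of reads and then reaps completions as they arrive, instead of blocking on each read in turn. The backing file name, request count and sizes are arbitrary placeholders.

/* Sketch of keeping several disk requests in flight with Linux AIO.
 * Build with: gcc -O2 aio_demo.c -laio
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NREQ  8          /* requests kept in flight (placeholder) */
#define BS    65536      /* 64 KB per request */

int main(void)
{
    const char *path = "/tmp/aio-test.img";    /* placeholder backing file */
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    if (io_setup(NREQ, &ctx) < 0) { fprintf(stderr, "io_setup failed\n"); return 1; }

    struct iocb cbs[NREQ], *cbp[NREQ];
    void *bufs[NREQ];
    int i;

    /* Queue NREQ reads at different offsets without waiting for any of them. */
    for (i = 0; i < NREQ; i++) {
        if (posix_memalign(&bufs[i], 4096, BS)) { perror("posix_memalign"); return 1; }
        io_prep_pread(&cbs[i], fd, bufs[i], BS, (long long)i * BS);
        cbp[i] = &cbs[i];
    }
    if (io_submit(ctx, NREQ, cbp) != NREQ) { fprintf(stderr, "io_submit failed\n"); return 1; }

    /* The submitting thread is free to do other work here; completions are
     * reaped as they become available, in whatever order the disk finishes. */
    struct io_event events[NREQ];
    int got = 0;
    while (got < NREQ) {
        int n = io_getevents(ctx, 1, NREQ - got, events, NULL);
        if (n < 0) { fprintf(stderr, "io_getevents failed\n"); break; }
        got += n;
        printf("completed %d of %d requests\n", got, NREQ);
    }

    for (i = 0; i < NREQ; i++)
        free(bufs[i]);
    io_destroy(ctx);
    close(fd);
    return 0;
}

From the guest's point of view each individual read still completes "synchronously" (the interrupt only arrives once the data is there), but the back end never sits idle while one request is on the wire, which is where the latency and therefore bandwidth gain comes from.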
> > >
> > > Third Question: Does the Xen hypervisor change the behaviour of the Linux I/O scheduler, more or less?
> >
> > Don't think so, but I'm by no means sure. In my view, the modifications to the Linux kernel are meant to be "the minimum necessary".
> >
> > > Fourth Question: Will AIO have a different performance impact on para-virtualized domains and fully virtualized domains respectively?
> >
> > The main difference is the reduction in overhead (particularly latency) in Dom0, which will affect both PV and HVM guests. HVM guests have more "other things" happening in Dom0 (such as QEMU work), but it's hard to say which gains more from this without also qualifying what else is happening in the system. If you have PV drivers in an HVM domain, the disk performance should be about the same, whilst the (flawed) benchmark of "hdparm" shows around a 10x performance difference between Dom0 and an HVM guest - so we lose a lot in the process. I haven't tried the same with tap:aio: instead of file:, but I suspect the interaction between guest, hypervisor and QEMU is a much larger component than the tap:aio: vs file: method of disk access.
> >
> > --
> > Mats
> >
> > > Thanks,
> > >
> > > Liang
> > >
> > > ----- Original Message -----
> > > From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
> > > To: "Mark Williamson" <mark.williamson@xxxxxxxxxxxx>; <xen-users@xxxxxxxxxxxxxxxxxxx>
> > > Cc: "Tom Horsley" <tomhorsley@xxxxxxxxxxxx>; "Goswin von Brederlow" <brederlo@xxxxxxxxxxxxxxxxxxxxxxxxxxx>; "James Rivera" <jrivera@xxxxxxxxxxx>
> > > Sent: Tuesday, January 16, 2007 10:22 AM
> > > Subject: RE: [Xen-users] Getting better Disk IO
> > >
> > > > -----Original Message-----
> > > > From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Mark Williamson
> > > > Sent: 16 January 2007 17:07
> > > > To: xen-users@xxxxxxxxxxxxxxxxxxx
> > > > Cc: Tom Horsley; Goswin von Brederlow; James Rivera
> > > > Subject: Re: [Xen-users] Getting better Disk IO
> > > >
> > > > > I've been hoping to see replies to this, but lacking good information, here is the state of my confusion on virtual machine disks:
> > > > >
> > > > > If you read the docs for configuring disks on domU and HVM machines, you'll find a gazillion or so ways to present the disks to the virtual machine.
> > > >
> > > > There are quite a lot of options, it's true ;-)
> > > >
> > > > > One of those ways (whose name I forget) provides (if I understand things, which I doubt :-) a special kind of disk emulation designed to be driven by special drivers on the virtual machine side. The combination gives near direct disk access speeds in the virtual machine.
> > > > >
> > > > > The catch is that you need those drivers for the kernel on the virtual machine side. They may already exist, you may have to build them, and depending on the kernel version, they may be hard to build.
> > > > >
> > > > > Perhaps someone who actually understands this could elaborate?
> > > >
> > > > Basically yes, that's all correct.
> > > >
> > > > To summarise:
> > > >
> > > > PV guests (that's paravirtualised, or Xen-native) use a Xen-aware block device that's optimised for good performance on Xen.
> > > > HVM guests (Hardware Virtual Machine, fully virtualised and unaware of Xen) use an emulated IDE block device, provided by Xen (actually, it's provided by the qemu-based device models, running in dom0).
> > > >
> > > > The HVM emulated block device is not as optimised (nor does it lend itself to such effective optimisation) for high virtualised performance as the Xen-aware device. Therefore a second option is available for HVM guests: an implementation of the PV guest device driver that is able to "see through" the emulated hardware (in a secure and controlled way) and talk directly as a Xen-aware block device. This can potentially give very good performance.
> > >
> > > The reason the emulated IDE controller is quite slow is a consequence of the emulation. The way it works is that the driver in the HVM domain writes to the same IO ports that the real device would use. These writes are intercepted by the hardware support in the processor, and a VMEXIT is issued to "exit the virtual machine" back into the hypervisor. The HV looks at the "exit reason" and sees that it's an IO WRITE operation. This operation is then encoded into a small packet and sent to QEMU. QEMU processes this packet and responds back to the HV to say "OK, done that, you may continue". The HV then does a VMRUN (or VMRESUME in the Intel case) to continue the guest execution, which most likely runs straight into another IO instruction writing to the IDE controller. There's a total of 5-6 bytes written to the IDE controller per transaction, and whilst it's possible to combine some of these writes into a single write, it's not always done that way. Once all writes for one transaction are completed, the QEMU IDE emulation code will perform the requested operation (such as reading or writing a sector). When that is complete, a virtual interrupt is issued to the guest, and the guest will see this as a "disk done" interrupt, just like real hardware.
> > >
> > > All these IO intercept steps take several thousand cycles, which is a bit longer than a regular IO write operation would take on the real hardware, and the system will still need to issue the real IO operations to perform the REAL hardware read/write corresponding to the virtual disk (such as reading a file, LVM or physical partition) at some point, so this is IN ADDITION to the time used by the hypervisor.
> > >
> > > Unfortunately, the only possible improvement on this scenario is the type of "virtual-aware" driver described below.
> > >
> > > [Using a slightly more efficient model than IDE may also help, but that's going to be marginal compared to the benefits of using a virtual-aware driver].
> > >
> > > --
> > > Mats
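As a rough illustration of the round trip described above, here is a toy simulation (every function and name below is invented for the example; this is not Xen or QEMU source code). Each port write in the guest's IDE driver becomes a VMEXIT, a hop to the device model, and a VMENTRY before the next write can even be issued.

#include <stdint.h>
#include <stdio.h>

struct io_request { uint16_t port; uint32_t data; };

/* Stand-in for the QEMU IDE emulation: just records the register write. */
static void qemu_handle_io(struct io_request *req)
{
    printf("  qemu: port 0x%03x <- 0x%02x\n", req->port, req->data);
}

/* Stand-in for the hypervisor's intercept handler: decode the exit, forward
 * the access to the device model, wait for the answer, resume the guest. */
static void vmexit_io_write(uint16_t port, uint32_t data)
{
    struct io_request req = { port, data };
    printf("vmexit: IO WRITE intercepted\n");
    qemu_handle_io(&req);        /* in reality an IPC to dom0 plus a wait */
    printf("vmentry: resume guest\n");
}

int main(void)
{
    /* An ATA READ SECTORS command programs the task-file registers 0x1F2 to
     * 0x1F7; under emulation each of these writes is a separate round trip. */
    uint16_t ports[] = { 0x1f2, 0x1f3, 0x1f4, 0x1f5, 0x1f6, 0x1f7 };
    uint8_t  vals[]  = { 0x01, 0x00, 0x00, 0x00, 0xe0, 0x20 };
    int i;

    for (i = 0; i < 6; i++)
        vmexit_io_write(ports[i], vals[i]);
    return 0;
}

That set of task-file writes is where the "5-6 bytes per transaction" figure comes from; a PV block device replaces the whole sequence with one shared-ring request and one event-channel notification.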
> > > > I don't know if these drivers are included in any Linux distributions yet, but they are available in the Xen source tree so that you can build your own, in principle. Windows versions of the drivers are included in XenSource's products, I believe - including the free (as in beer) XenExpress platform.
> > > >
> > > > There are potentially other options being developed, including an emulated SCSI device that should improve the potential for higher-performance IO emulation without Xen-aware drivers.
> > > >
> > > > Hope that clarifies things!
> > > >
> > > > Cheers,
> > > > Mark
> > > >
> > > > --
> > > > Dave: Just a question. What use is a unicycle with no seat? And no pedals!
> > > > Mark: To answer a question with a question: What use is a skateboard?
> > > > Dave: Skateboards have wheels.
> > > > Mark: My wheel has a wheel!

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel