So here are some details on the SAN LUN... the SAN is a Compellent SAN attached to my FC switch (McData Sphereon 4700, now the Brocade M4700) with 4 x 2Gb FC connections. The dom0 uses the QLogic QLE2462 adapter, with a single 4Gb connection hooked up. I did find that there is a later driver available - I'll try switching to that when I get a chance. One interesting thing I found is that the adapter appears to be in a x4 PCIe slot, but the link seems to be capped at 2.5Gbps. I'm not sure if this is a QLogic issue or if I need to move the card to a different slot in my Dell PowerEdge R610 chassis, but it looks like my PCIe bus is limiting me to about 2/3 of the speed of the FC connection. The HBA is using a 4Gbps point-to-point connection with a frame size of 2048. Any hints on whether any of that needs tuning would be great.
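For what it's worth, here's how I'd check whether the card has negotiated fewer PCIe lanes than the slot provides. This is just a sketch run against sample `lspci -vv` output - the LnkSta values below are assumed (a card downtrained to x1), not taken from my box:

```shell
# Compare the PCIe link the HBA is capable of (LnkCap) with what it
# actually negotiated (LnkSta). On a live system you'd feed this from:
#   lspci -vv -s <bus:dev.fn of the QLE2462>
# The sample below is hypothetical output for a card downtrained to x1.
lspci_sample='LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1
LnkSta: Speed 2.5GT/s, Width x1'

echo "$lspci_sample" | awk -F'Width ' '
  /LnkCap/ { split($2, a, ","); cap = a[1] }
  /LnkSta/ { split($2, a, ","); sta = a[1] }
  END { print "capable: " cap ", negotiated: " sta }
'
```

If "negotiated" comes back narrower than "capable", reseating the card or moving it to another riser slot is usually the fix. Note that "Speed 2.5GT/s" in lspci is the per-lane PCIe 1.x signaling rate, not the total bandwidth of the slot.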
I'm not really sure that bandwidth is the issue - it's probably latency more than that. I don't think the amount of data is what's causing the problem; rather, it's the number of transactions the e-mail system is trying to do on the volume. The file sizes are actually pretty small - 1 to 4 KB on average - so I think it's the large number of these files that it has to read, rather than streaming a large amount of data. Both the SAN and the iostat output on dom0 and domU indicate read rates somewhere between 5000 and 20000 kB/s - roughly 40Mb/s to 160Mb/s, which is well within the capability of the FC connection. The SAN is indicating between 500 and 1500 I/O requests per second, which I assume is what's causing the problem.
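Just to double-check my math (nothing here is measured - it's only arithmetic on the numbers above):

```shell
# Convert the observed read rates from kB/s (kilobytes) to Mb/s (megabits):
for rate_kBps in 5000 20000; do
  echo "${rate_kBps} kB/s = $(( rate_kBps * 8 / 1000 )) Mb/s"
done

# At the peak of ~1500 req/s and ~20000 kB/s, the implied average request
# is small: random small I/O like this hammers the array's IOPS budget
# long before it comes anywhere near saturating a 4Gb FC link.
echo "avg request at peak: $(( 20000 / 1500 )) kB"
```

So the link itself has plenty of headroom; the request rate is the thing to chase.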
Again, any tips on what to look at next would be greatly appreciated! Thanks for all the advice so far!
-Nick
>>> On 2009/08/27 at 03:00, Pasi Kärkkäinen<pasik@xxxxxx> wrote:
On Wed, Aug 26, 2009 at 12:07:55PM -0600, Nick Couchman wrote:
> Doesn't really seem to make a difference which way I do it...I still see pretty intense disk I/O. Here is some sample output from iostat in the domU:
>
> Device:  rrqm/s  wrqm/s      r/s    w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> xvdb      12.20    0.00  1217.40  26.20  9197.60  530.80    15.65    29.66  23.47   0.80 100.00
>
> Device:  rrqm/s  wrqm/s      r/s    w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> xvdb      18.40    0.00  1121.20  19.60  8737.60  691.50    16.53    32.97  29.13   0.88 100.00
>
> Device:  rrqm/s  wrqm/s      r/s    w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> xvdb      27.80    0.00  1241.40  29.20  8158.40  377.90    13.44    42.59  33.73   0.79 100.00
>
> Device:  rrqm/s  wrqm/s      r/s    w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> xvdb      31.60    0.00  1256.60  35.00  9426.40  424.00    15.25    42.06  32.44   0.77 100.00
>
> Device:  rrqm/s  wrqm/s      r/s    w/s    rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> xvdb      57.68    0.00  1250.50  17.76  8588.42  352.99    14.10    51.36  40.60   0.79  99.80
>
> The avgqu-sz is anywhere from 11 to 75, and the await is anywhere from 20 to 50. %util is always around 100.
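Those numbers work out to quite small reads. A quick sanity check on the first sample line (values copied from the iostat output above, rkB/s divided by r/s):

```shell
# Average read size from the first iostat sample above:
#   rkB/s = 9197.60, r/s = 1217.40
awk 'BEGIN { printf "%.1f kB per read\n", 9197.60 / 1217.40 }'
```

Around 7-8 kB per read at over a thousand reads per second is a random small-I/O workload, which matches the mail-store description.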
Well.. it seems your SAN LUN is the problem. Have you checked the load from the FC Storage array?
Or the problem could be in your FC HBA. Have you verified the FC link is at full speed?
Are the FC switches OK?
Do you have an up-to-date HBA driver in dom0? Are the HBA/switch/storage firmwares up-to-date?
-- Pasi