
Re: [Xen-users] Cheap IOMMU hardware and ECC support importance



Gordan Bobic <gordan@xxxxxxxxxx> writes:

> On 06/28/2014 08:45 AM, lee wrote:
>>
>> The hardware RAID controller gives me 10 fps more in my favourite
>> game, compared to software RAID.  Since fps rates can be rather low
>> (because I'm CPU-limited), that makes a significant difference.
>
> If your game is grinding on disk I/O during play, all is lost
> anyway. If your CPU and RAM are _that_ constrained, there is probably
> a better way to spend whatever you might pay for a new caching RAID
> controller these days.

Only I didn't buy the controller new, and I bought it to have a decent
amount of ports.

It's not disk I/O or a lack of RAM that limits the fps rates, it's
actually the CPU (or the whole combination of CPU, board and RAM) not
being able to feed the graphics card fast enough --- or the graphics
card being too fast for the CPU, if you want to see it that way.  To get
a significantly faster system, I'd have to spend ten times or more than
what I paid for the controller.  The CPU alone would cost more.  I
didn't expect any change in fps rates and got the improvement as a
surprising side effect.

>> I don't know about ZFS, though, never used that.  How much CPU overhead
>> is involved with that?  I don't need any more CPU overhead like what
>> comes with software RAID.
>
> If you are that CPU constrained, tuning the storage is the wrong thing
> to be looking at.

What would you tune without buying a new CPU, board and RAM, and without
running into the same problem of too few SATA ports?

>>>> expensive ones.  Perhaps the lack of ports is not so much of a problem
>>>> with the available disk capacities nowadays; however, it is what
>>>> made me get a hardware raid controller.
>>>
>>> Hardware RAID is, IMO, far too much of a liability with
>>> modern disks. Latent sector errors happen a lot more
>>> often than most people realize, and there are error
>>> situations that hardware RAID cannot meaningfully handle.
>>
>> So far, it works very well here.  Do you think that software RAID can
>> handle errors better?
>
> Possibly in some cases.

Cases like?  ZFS, as you described it, might.

>> And where do you find a mainboard that has like
>> 12 SAS/SATA ports?
>
> I use a Marvell 88SX7042 4-port card with two SIL3726 SATA port
> multipliers on it. This works very well for me and provides more
> bandwidth than my 12 disks can use in a realistic usage pattern.

Do you mean a card like this one:
http://www.hardware-rogge.com/product_info.php?products_id=15226

This card alone costs almost as much as I paid for the RAID controller.

How come you use such a card?  Couldn't you use the on-board SATA ports
and connect a multiplier to them?
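
For my own board, I suppose the first thing to check would be whether
the on-board AHCI controller supports port multipliers at all.  A quick
and dirty sketch (untested; it relies on the ahci driver printing its
capability flags, including "pmp", in a "flags:" line in the kernel
log):

    #!/usr/bin/env python3
    # Quick check whether the on-board AHCI controller advertises port
    # multiplier (PMP) support, by looking for "pmp" among the capability
    # flags the ahci driver prints at boot.  Untested sketch; dmesg may
    # need root on some systems.
    import subprocess

    log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout

    for line in log.splitlines():
        if "ahci" in line and "flags:" in line:
            flags = line.split("flags:", 1)[1].split()
            print(line.strip())
            print("-> PMP supported" if "pmp" in flags else "-> no PMP support")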

> In contrast, I have three SAS RAID cards, two LSI and one Adaptec,
> none of which work at all on my motherboard with the IOMMU enabled.

Hmmm, how come, and what are the symptoms?  Perhaps I should try to
force NUMA to be enabled for the server.
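
Before fiddling with the BIOS I should probably check what the kernel
thinks it has.  A small sketch (assuming a Linux dom0 with sysfs
mounted) that shows whether the IOMMU actually came up and which group
each PCI device landed in --- no groups listed means it isn't active:

    #!/usr/bin/env python3
    # List IOMMU groups and the PCI devices in each one.  If the
    # directory is missing or empty, the kernel did not enable the IOMMU.
    import os

    GROUPS = "/sys/kernel/iommu_groups"   # standard Linux sysfs location

    if not os.path.isdir(GROUPS) or not os.listdir(GROUPS):
        print("No IOMMU groups found - IOMMU disabled or unsupported.")
    else:
        for group in sorted(os.listdir(GROUPS), key=int):
            devices = os.listdir(os.path.join(GROUPS, group, "devices"))
            print(f"group {group}: {', '.join(sorted(devices))}")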

>> It seems that things are getting more and more complicated --- even
>> though they don't need to be --- and that people are getting more and
>> more clueless.  More bugs might be a side effect of that, and things
>> aren't done as thoroughly as they used to be done.
>
> Indeed. The chances of getting a filed Fedora bug fixed, or even
> acknowledged before Fedora's 6-month EOL bug zapper closes it for
> you, are vanishingly small, in my experience.

Yes, it's ridiculous.  I find it really stupid to close a bug merely
because some amount of time has passed, rather than because it was
looked into and fixed, or at least checked to see whether it still
exists.  That's no way to handle bug reports.  People will simply stop
filing them because that's the best they can do.

>>> I find that on my motherboard most RAID controllers don't work
>>> at all with IOMMU enabled. Something about the transparent bridges
>>> that connect the native PCI-X RAID ASICs to PCIe makes things not
>>> work.
>>
>> Perhaps that's a problem of your board, not of the controllers.
>
> It may well be, but it does show that the idea that a SAS RAID
> controller with many ports is a better solution does not universally
> apply.

I never said it would :)  I was looking at what's available to increase
the number of disks I could connect, and I found you can get relatively
cheap cards with only two ports which may or may not work.  More
expensive cards would have four ports and might or might not work, and
cards with more than four ports were mostly RAID controllers.  For the
cards with four or more ports, the prices were higher than what I could
get a fully featured SAS/SATA RAID controller with 8 internal ports for,
so I got that one --- and it's been working flawlessly for two years or
so now.  Only the server has problems ...

>>> Cheap SAS cards, OTOH, work just fine, and at a fraction of
>>> the cost.
>>
>> And they provide only a fraction of the ports and features.
>
> When I said SAS above I meant SATA. And PMPs help.

Well, which ones do work?  I didn't find anything about that when I
looked, and I didn't come across port multipliers at all.

> The combination of SATA card and PMPs supports FIS and NCQ which means
> that the SATA controller's bandwidth per port is used very
> efficiently.

Is that a good thing?  I have a theory that when you have a software
RAID-5 with three disks and another RAID-1 with two disks, you have to
move so much data around that it plugs up the system, causing slowdowns.
Even a software RAID-1 with two disks can create slowdowns, depending on
what data transfer rates the disks can sustain.  I do not have such
slowdowns when the disks are on the hardware RAID controller.
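
To put rough numbers on that theory --- a back-of-the-envelope sketch,
not a measurement, assuming small in-place writes where RAID-5 has to
do read-modify-write:

    # Extra bus traffic software RAID generates for application writes.
    app_write_mb = 100      # hypothetical amount written by applications

    # RAID-1, 2 disks: every block is written to both disks.
    raid1_traffic = app_write_mb * 2

    # RAID-5, 3 disks, small writes: read old data + old parity, then
    # write new data + new parity -> roughly 4 transfers per write.
    raid5_traffic = app_write_mb * 4

    print(f"RAID-1 bus traffic: ~{raid1_traffic} MB")
    print(f"RAID-5 bus traffic: ~{raid5_traffic} MB")

All of that goes over the same memory bus and CPU that are also trying
to feed the graphics card.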

Perhaps it's a problem with the particular board I have, and/or the CPU
being too slow to handle the overhead without it becoming noticeable.
Perhaps it's much better to fill the bandwidth of a single SATA port
than to use some of the bandwidth of five SATA ports.  Or perhaps
filling the bandwidth of one SATA port, plus the CPU handling the
overhead ZFS brings with it, isn't any better --- who knows.

>> Anyway, I have come to like hardware RAID better than software RAID.
>
> Whatever works for you. My view is that traditional RAID, certainly
> anything below RAID6,

Well, you have to be able to afford all the disks for such RAID levels.
Is ZFS any better in this regard?
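
Just to put a number on "afford all the disks" (hypothetical disk
count, size and price):

    # Rough cost of the parity overhead for RAID-5 vs. RAID-6, ignoring
    # spares and filesystem overhead.  All numbers are made up.
    n_disks, disk_tb, disk_eur = 6, 4, 150

    for name, parity in (("RAID-5", 1), ("RAID-6", 2)):
        usable = (n_disks - parity) * disk_tb
        total = n_disks * disk_eur
        print(f"{name}: {usable} TB usable of {n_disks * disk_tb} TB raw, "
              f"~{total / usable:.0f} EUR per usable TB")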

> and even on RAID6 I don't trust the closed, opaque, undocumented
> implementation that might be in the firmware, is

It's a big disadvantage of hardware RAID that you can't read the data
when the controller has failed, unless you have another, compatible
controller at hand.  Did you check the sources of ZFS so that you can
trust it?

> no longer fit for purpose with disks of the kind of size that ship
> today.

How would that depend on the capacity of the disks?  More data --> more
potential for errors --> more protection required?
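
If it really is just about how much data you have to read back without
an error during a rebuild, a rough calculation (assuming the commonly
quoted consumer-disk spec of one unrecoverable read error per 1e14 bits
read) looks like this:

    # Chance of hitting at least one unrecoverable read error (URE) when
    # reading a whole disk end to end.  Purely illustrative; uses the
    # spec-sheet figure of one URE per 1e14 bits for consumer disks.
    URE_PER_BIT = 1e-14

    def p_ure(terabytes):
        bits = terabytes * 1e12 * 8
        return 1 - (1 - URE_PER_BIT) ** bits

    for tb in (1, 4, 8):
        print(f"{tb} TB read end to end: ~{p_ure(tb):.0%} chance of a URE")

With numbers like that, rebuilding a large single-parity array starts
to look like a coin toss, which would explain the "no longer fit for
purpose" remark.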

>> You could as well argue that graphics cards are evil.
>
> It comes down to what makes a good tool for the job. There are jobs
> that GPUs are good at. When it comes to traditional RAID, there are
> things that are more fit for the purpose of ensuring data
> integrity.

Always use the right tool.

>> So with VMware, you'd have to get certified hardware.
>
> You wouldn't _have_ to get certified hardware. It just means that if
> you find that there is a total of one motherboard that fits your
> requirements and it's not on the certified list, you can plausibly
> take your chances with it even if it doesn't work out of the box. I
> did that with the SR-2 and got it working eventually in a way that
> would never have been possible with ESX.

Are you saying that for your requirements you couldn't use VMware, which
makes it irrelevant whether the hardware is certified for it or not?

>>>> After all, I'm not convinced that virtualization as it's done with xen
>>>> and the like is the right way to go.
>>> [...]
>>>
>>> I am not a fan of virtualization for most workloads, but sometimes
>>> it is convenient, not least in order to work around deficiencies of
>>> other OS-es you might want to run. For example, I don't want to
>>> maintain 3 separate systems - partitioning up one big system is
>>> much more convenient. And I can run Windows gaming VMs while
>>> still having the advantages of easy full system rollbacks by
>>> having my domU disks backed by ZFS volumes. It's not for HPC
>>> workloads, but for some things it is the least unsuitable solution.
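
So the rollback part would be something like this, if I read the zfs
manual page correctly?  (I haven't used ZFS, so this is just a sketch,
and the pool/volume names are made up.)

    # Sketch of the snapshot-before-update / rollback-on-failure workflow
    # for a ZFS volume backing a domU.  Untested; names are hypothetical.
    import subprocess

    ZVOL = "tank/win7-gaming"        # hypothetical zvol backing the domU
    SNAP = f"{ZVOL}@before-update"

    def zfs(*args):
        subprocess.run(["zfs", *args], check=True)

    zfs("snapshot", SNAP)            # cheap point-in-time copy
    # ... run the risky update inside the domU ...
    # zfs("rollback", SNAP)          # if it went wrong, go back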
>>
>> Not even for most?  It seems as if everyone is using it quite a lot,
>> whether it makes sense or not.
>
> Most people haven't realized yet that the king's clothes are not
> suitable for every occasion, so to speak. In terms of the hype cycle,
> different users are at different stages. Many are still around the
> point of "peak of inflated expectations". Those that do the testing
> for their particular high performance workloads they were hoping to
> virtualize hit the "trough of disillusionment" pretty quickly most of
> the time.

Why would they think that virtualization benefits things that require
high performance?  When I need the most/best performance possible, it's
obviously counterproductive.

> But there ARE things that it is useful for, as I mentioned
> in the paragraph above. Consolidating mostly idle machines and using
> virtualization to augment the ease and convenience of backup/restore
> procedures through adding features that don't exist in the guest OS
> are obvious examples of uses that virtualization is very good
> for. That would be the "plateau of productivity".

Indeed, it's really great for that.  Even where you need fairly good
performance it can make sense, provided you run a reasonable combination
of VMs.


-- 
Knowledge is volatile and fluid.  Software is power.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

