Re: [Xen-users] Recommendations for Virtualization Hardware
On 2012-09-27 03:34, ShadesOfGrey wrote:
On 09/24/2012 08:49 AM, Robin
Axelsson wrote:
On 2012-09-24 05:45, ShadesOfGrey wrote:
Sorry for the late response, I've had a lot to
digest.
<snip>
The lack of current information about Xen (and
KVM) online has been frustrating — especially finding the many
proof of concept videos that demonstrated possibilities but
offered no real specifics. Looking for specifics, I sought
information from gaming and enthusiast sites; I figured
finding confirmation of VT-d and AMD-Vi support on such sites
would be more likely. However, I found that wasn't often the
case. I did determine that ASRock motherboards seem to be the
most likely to support VT-d, ASUS least likely (unless
equipped with an Intel 'sanctioned' VT-d chipset). I had
narrowed my choices to two motherboards that appear to offer
VT-d support and was intending to contact the manufacturer
before purchase. Both choices are a bit pricey and I've been
reconsidering whether I should look to other motherboards to
reduce costs.
Some motherboards support IOMMU even though it is not found in
the user manual or specified on the website. Your best bet is to
ask customer support. A guy posted here that he got it working
on an Intel motherboard that doesn't even have options for it in
the BIOS, so it seems that in some cases it is only up to the
CPU. This is not the case with AMD though, as I stated before. I
have bought a couple of Gigabyte GA990FX-UD7 boards myself; they are
stable and have a good layout. They have support for IOMMU, but I
haven't tested it thoroughly enough to fully confirm this,
although I don't believe there would be any problem.
I'm aware of this. In fact, I only have anecdotal evidence that
the Gigabyte G1.Sniper 3 has IOMMU (VT-d) support. I intend to
query the manufacturers, seeking confirmation of IOMMU support, of
every motherboard that ends up on my short list. I may include
other Z77 motherboards from MSI and Gigabyte, since there is
concrete evidence that the Z77 chipset does support IOMMU. I did
focus, however, on those motherboards I had some inkling could
support IOMMU. Now I'm trying to expand my selection process to
include less expensive options.
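For what it's worth, once a candidate board is actually running Linux it should be fairly easy to sanity-check whether the kernel enabled the IOMMU. Here is a rough sketch (my own untested assumption is that the usual DMAR/AMD-Vi boot messages are present; the exact strings vary between kernel versions):

# Rough check for IOMMU/VT-d/AMD-Vi support by scanning the kernel log.
# Assumes a Linux host; the exact log strings vary by kernel version.
import subprocess

def iommu_hints():
    dmesg = subprocess.check_output(["dmesg"]).decode("utf-8", "replace")
    keywords = ["DMAR", "IOMMU", "AMD-Vi", "Intel-IOMMU"]
    return [line for line in dmesg.splitlines()
            if any(k in line for k in keywords)]

if __name__ == "__main__":
    hits = iommu_hints()
    if hits:
        print("\n".join(hits))
    else:
        print("No IOMMU-related messages found; check the BIOS and the "
              "intel_iommu=on / iommu=on kernel parameters.")

Under Xen, I believe 'xl dmesg' (or 'xm dmesg') is the more direct check, since the hypervisor boot log states whether I/O virtualisation was enabled.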
It surprises me that ASRock and ASUS are so
different. ASRock is, or at least used to be, a subsidiary of
ASUS, so there shouldn't be that much difference between them.
Same here. But I guess things changed after ASRock was spun off.
<snip>
This is precisely the kind of information I was
looking for from the threads I started on Ars Technica. It's
just unfortunate that FLR and D3/D0 support aren't often found
in the tech specs of most expansion hardware. However, now
that I know what to ask, I'll try contacting hardware
manufacturers prior to purchasing any expansion hardware.
Thank you!
D3 and D0 are power states defined for devices in the ACPI
specification and can be used to control the supply voltage
(Vcc) to PCI and PCIe devices. You can find more information
about it here for example:
http://en.wikipedia.org/wiki/Advanced_Configuration_and_Power_Interface
---------------
Device states
The device states D0-D3 are device-dependent:
- D0 Fully On is the operating state.
- D1 and D2 are intermediate power-states
whose definition varies by device.
- D3 Off has the device powered off and
unresponsive to its bus.
---------------
So, either it works for a certain type of hardware or it doesn't,
and I wouldn't expect a vendor to state this "support" in the
specifications, since it isn't a "feature" in and of itself, if
you get me. But maybe this will change and FLR support
will become more widespread.
I did read that Wikipedia entry, following your referenced excerpt. To
my mind, if a manufacturer claims support for the ACPI or PCIe
spec and doesn't implement certain portions of those specs (some
are optional, after all), said manufacturer has an obligation to
make that clear to its customers and users. But then, that's
just me. Anyway, this information gives me what I need to ask the
right questions of tech support before purchasing.
I think most motherboards should support this.
I've also had it confirmed by nVidia that they support FLR on all
current Quadro cards greater than or equal to Quadro 2000, on Tesla
C2050 and higher and M2050 and higher, and on new VGX and GRID
cards.
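If you already have a card in hand you can also check for yourself whether it advertises FLR, instead of waiting for the vendor to answer. A small sketch, assuming pciutils is installed; it just greps the lspci output for the FLReset capability bit, and the device address is only an example:

# Check whether a PCI(e) device advertises Function Level Reset (FLR).
# Requires pciutils (lspci); run as root so the capability list is visible.
# The bus address below is only an example.
import subprocess, sys

def has_flr(bdf):
    out = subprocess.check_output(["lspci", "-vv", "-s", bdf]).decode("utf-8", "replace")
    # lspci reports "FLReset+" in the Device Capabilities line when FLR is supported.
    return "FLReset+" in out

if __name__ == "__main__":
    bdf = sys.argv[1] if len(sys.argv) > 1 else "01:00.0"
    print("%s: FLR %s" % (bdf, "supported" if has_flr(bdf) else "not advertised"))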
<snip>
From everything I've read, solutions that rely on
any form of remote display protocols would be limited to a
subset of Direct3D functions. Furthermore, these would vary
from one implementation to another, thus making them far less
attractive for gaming than VGA passthrough... Well, in my
opinion anyway.
VirtualBox's seamless mode is pretty nifty. But it's a Type 2
Hypervisor and relies on paravirtualized drivers that also
suffer from the same limitations as remote display protocols.
It's great for most things, but gaming is not one of them. And
I'm speaking from personal experience. Though I haven't used
them myself, the same would seem to hold true of Parallels'
and VMware's 'Workstation' offerings. At least, as far as I've
gathered.
FYI, the Type 1 Hypervisors from Parallels and VMware* are
priced waaayyy outside my budget.
I understand that you want full 3D functionality for Windows
gaming but maybe you'll find the subset of 3D functionality for
the Linux machine acceptable. I have looked into VirtualGL and
with TurboVNC, you might get a pretty decent desktop environment
and it seems like most of the features are there already. It
appears that the 3D is rendered by hardware/GPU before it is
streamed through VNC or Spice. So it seems that you would need
another GPU for that. You can find more info on VirtualGL here:
http://www.virtualgl.org/
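The quickest way to see what you get, assuming VirtualGL and TurboVNC are already installed, is simply to launch an OpenGL program through vglrun inside the VNC desktop. A trivial sketch (glxgears is only a stand-in for a real application, and :1 is just a typical first TurboVNC display):

# Minimal sketch: run an OpenGL app through VirtualGL inside a TurboVNC desktop.
# Assumes vglrun is on PATH; DISPLAY must point at the TurboVNC X display.
import os, subprocess

os.environ.setdefault("DISPLAY", ":1")   # typical first TurboVNC display, adjust as needed
subprocess.call(["vglrun", "glxgears"])  # glxgears is only a stand-in for a real 3D app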
Upon further investigation, the only option that is available to
remotely translate Windows 3D apps is Microsoft's own RemoteFX.
In which case, Microsoft products would be the foundation of my
software stack... Something I'm trying to avoid.
I was just giving you the options I could think of OTOH that would
let you share a Windows desktop with a Linux desktop, or rather
share the desktop of one VM with the host machine.
Also, the line between a type 1 and a type 2 hypervisor tends to get
a bit blurry. The point with type 1 is that it has access to
ring 0, so it can directly access the hardware to be
passed through to the guests (I did confuse 'host' and 'guest'
in my prior post). It also doesn't need to ask the host OS for
permission in the same way as a type 2 hypervisor, which is
likely to give performance advantages in some cases.
However, even a type 2 hypervisor, although it runs as an
application inside the OS, can get "type 1"-like privileges. By
patching into the kernel and/or using special "dummy drivers"
for hardware to be shared with VMs, you can achieve pretty much
the same thing, ergo it is no longer clear whether the
hypervisor is type 1 or type 2.
There is an article about it from the old IBM Mainframe days but
I can't seem to find it.
You might be right; the performance of a Type 2 Hypervisor may be
sufficient. Regardless, I'd still have roughly the same hardware
requirements as if I were going to use a Type 1 Hypervisor. I'd
still need the same CPU, RAM, GPU, and storage as
I've already put forth. I could omit the additional components that
might be necessary with a Type 1 Hypervisor, things
like a USB controller, NIC, or sound card, but those costs would be
replaced with the cost of licensing the Type 2 Hypervisor.
In which case, I don't really gain or lose anything by
experimenting with Type 1 Hypervisors... At any rate, I could
compare and contrast the performance of Type 1 and Type 2
Hypervisors using Parallels' and VMware's trialware. It might take
a good long while, but it would be an adventure.
*I only found out about VMware's 'free' vSphere after I'd
written this response.
<snip>
Also, it is highly recommended that
you use ECC RAM for such applications, and it doesn't hurt to
dedicate a few gigs of it to ZFS, as RAM is used for
caching. The good news is that most motherboards with good
chipsets support ECC RAM even though you might not find
anything about it in the user manuals.
Again, thanks for the thorough explanation. This gives me a
great deal to think about. The more I learn about ZFS, the
less appealing it becomes. By that I mean the confusion
over which version of ZFS is in which OS, and just how well
maintained the OSes supporting ZFS are. Now I have additional
hardware considerations to keep in mind that may (or may not)
make the cost of a ZFS RAID-Z pool comparable to a hardware
RAID5/6 solution anyway. Do you have any suggestions as to
which LSI HBAs I should be considering? I haven't found an
HCL for ZFS in my searches.
Out of curiosity — and if you would happen to know — do you
think what you suggest about the HBA and SAS drives for ZFS
also applies to Btrfs? I'm assuming it would, but I'd
appreciate some confirmation.
It's funny how the "I" in RAID never really seems to apply...
Especially since it looks more and more like using ZFS or
Btrfs will require that I commit myself, from the start, to one or
the other and a discrete HBA. Transitioning from integrated
SATA controller(s) and mdadm seems rather impractical, if I
understand what's involved in doing so correctly. It may turn
out that anything other than mdadm is price-prohibitive.
I don't think you will have a problem with getting ZFS to run,
and if that's your only goal then you don't need to be very
picky with your choice of hardware. I find ZFS pretty easy and
handy to use. It has really great functionality and I don't have
many bad things to say about it so far. ZFS is a filesystem
(along with a couple of software tools to administrate it), just
like EXT4 or NTFS, so hardware support depends on the platform it
runs on.
But the point with using ZFS is to get maximum protection
against data corruption and that's where the selection of
hardware gets limited and there are "best practices" set up to
achieve that. I have not tested ZFS on any other platform than
on OpenSolaris and OpenIndiana but I do know that it is well
implemented on that platform and more mature there than on any
other (non-Solaris) platform. Another advantage with the OSOL/OI
platform is that the CIFS functionality is implemented in
kernel space and not in userland, which gives a
performance advantage if you intend to share files with
Windows computers. (I don't deny that Samba is pretty good on Linux too.
There are some benchmarks on the Phoronix website comparing
Samba with NFS, and they are in favor of Samba on those
benchmarks...) The second-best implementation is found in
FreeBSD and it is probably fairly mature, but I haven't tested it
myself and some people have run into problems with it in the
past. The Linux version is probably still in its infancy and
likely not yet mature enough for regular use. It is probably not
as "bad" as Btrfs, though. There is quite a bit of information
about it on the phoronix.com website (and probably also at
lwn.net):
http://www.phoronix.com/scan.php?page=news_item&px=MTE4Nzc
My goal wasn't just to experiment with Btrfs, mdadm, or ZFS, if
that's what you meant by "with getting ZFS to run and if that's
your only goal". I actually intend to use it 'in production'. In
fact, the secondary role (Linux desktop being primary) for my
proposed virtualization rig is as a file server (incl. httpd).
Windows gaming is a tertiary or quaternary concern for me. So,
finding out that the "best practices" for implementing ZFS include
hardware I hadn't anticipated is a bit off-putting.
The diversity in ZFS implementations is what I meant about being
confused. I'm not as familiar with the underlying platforms that
utilize ZFS (other than Linux, and not with ZFS in use). Without
that familiarity, it's more difficult to gauge which of
those platforms would suit my purposes. For example, I had read a
little bit about the degraded CIFS performance on FreeBSD and
Linux due to their reliance on Samba (its residing in user space
being the issue).
BTW, I meant to ask how you came to the conclusion that ECC RAM is
supported on desktop motherboards? It's always been my
understanding that you could use ECC RAM on such hardware, but
there was no added benefit.
When I looked at the Gigabyte GA990FX motherboard there was no
documentation about it anywhere, nor was it stated in the
specifications. When I contacted customer support and asked about
it, they sent me screen dumps of the BIOS showing that it does in
fact support ECC, providing different ECC scrubbing options. So you can
ask support if you are unsure.
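On a running Linux system you can also get a hint yourself, without waiting for support, by checking whether the EDAC driver registered a memory controller and what dmidecode reports for the memory array. A rough sketch (paths and wording differ between kernels and boards, and dmidecode needs root):

# Rough check for active ECC on a Linux system.
# EDAC registers a memory controller under sysfs when ECC reporting is in use;
# dmidecode (run as root) prints the "Error Correction Type" of the memory array.
import os, subprocess

edac = "/sys/devices/system/edac/mc"
if os.path.isdir(edac) and os.listdir(edac):
    print("EDAC memory controller(s): %s" % ", ".join(os.listdir(edac)))
else:
    print("No EDAC memory controller registered (ECC may be off or unsupported).")

try:
    out = subprocess.check_output(["dmidecode", "-t", "memory"]).decode("utf-8", "replace")
    for line in out.splitlines():
        if "Error Correction Type" in line:
            print(line.strip())
except (OSError, subprocess.CalledProcessError):
    print("dmidecode not available or not run as root.")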
The benefit of ECC is that there is a parity bit that checks the
integrity of the data present in the RAM. We have cosmic
radiation and bit flips do occur, and the memory can also turn out to be
faulty. It has been debated whether we really "need" this on a
desktop. The finer lithography of the hardware is likely to make it
more sensitive to such failures and bit flips than before, so it
makes more sense to use ECC today than in the past. If you are
unlucky, your computer will crash because of that. Some files may get
corrupted in the process. It may not happen very often, but it does
happen occasionally; hopefully the bit flips happen in an address
space that is not currently in use. If you run a server, on the other
hand, then you are likely to run into freezes and crashes eventually
(it may depend on how long the uptimes we're talking about are). Using ECC
RAM will prevent those crashes and add extra protection. In fact,
Microsoft recommends ECC RAM even on desktop computers.
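To illustrate the idea with a toy example: a single parity bit is enough to detect (though not correct) one flipped bit in a stored word, which is the basic mechanism; real ECC DIMMs use Hamming-style codes that can also correct single-bit errors:

# Toy illustration of parity checking: detecting a single flipped bit in a byte.
# Real ECC memory uses Hamming-style (SECDED) codes that can also correct
# single-bit errors, but the detection principle is the same.
def parity(word):
    # Even parity: 1 if the number of set bits is odd, otherwise 0.
    return bin(word).count("1") % 2

stored = 0b10110010
stored_parity = parity(stored)

corrupted = stored ^ (1 << 5)   # simulate a cosmic-ray bit flip in bit 5

print("stored parity:    %d" % stored_parity)
print("read-back parity: %d" % parity(corrupted))
print("error detected:   %s" % (parity(corrupted) != stored_parity))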
ZFS does offer protection against data corruption, but if you are
unlucky, some data corruption may go undetected when written from
RAM. ZFS (or any other filesystem) will write the damaged data to
disk and be unable to automatically detect the corruption. This is
why using ECC RAM with ZFS is "strongly recommended" in the best
practices.
A search there on ZFS will give more articles. The
latest official version of ZFS is 28 and is probably implemented
in both Linux and FreeBSD by now. Later versions have been
released since Oracle killed the OpenSolaris project and can be
found with the commercial closed-source Solaris platform that is
supplied by Oracle. Things have happened since Oracle pulled the
plug on the OSOL project, and leading developers behind the ZFS
project such as Jeff Bonwick left Sun (after the acquisition by
Oracle) and joined up with the Illumos team instead. So you
cannot determine the stability of ZFS and zpool merely by
looking at the version number, unfortunately, and I wouldn't
expect the FreeBSD implementation to be as stable as the Solaris
implementation. It just takes time for the implementation to
mature and the bugs to be weeded out and it just happens to have
been around for Solaris/OpenSolaris/Illumos for much longer than
the other platforms and the Solaris/Illumos version also happens
to get first dibs on the features. Among the Illumos people
there is an ambition to drop the version numbering altogether
and instead talk about available features.
One of the two ZFS implementations on Linux has reached version
28, the other is at version 23 and seems to be abandoned or
stagnant (last release was May 2011). I'm not quite sure what the
status of ZFS on FreeBSD is. From this table
on Wikipedia, ZFS is at version 28. However the table notes that
there is no CIFS or iSCSI support, which I'll try to independently
confirm. And, as you say, the fact that ZFS (and Solaris as a
whole) has essentially been forked, with the re-consolidation of
Solaris as closed-source, just adds to the confusion. It's
possible any given build of Illumos' version 28 of ZFS could have
features or bug fixes not present in Oracle's version 33.
I don't know much about FreeBSD, but in Solaris/Illumos CIFS is
implemented in kernel space, while on platforms such as Linux and BSD
it is provided by the Samba framework in userspace. If you want
iSCSI then you have full support for it in Illumos-based operating
systems.
Also keep in mind that later version storage pools are not readable
on systems with lower versions of zfs/zpool. zpool 28 is implemented
in FreeBSD 9.0 and onwards. To my knowledge the ZFSonLinux project
also has implemented zpool version 28.
It should be noted that people also have had problems with migrating
a ZFS pool from one platform to another even though the version
numbers matched.
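Before moving a pool it is easy to check which version it is at and which versions the target system's tools understand, for example (the pool name 'tank' is only an example):

# Check the on-disk version of a pool and the versions the local zpool
# command supports, before trying to import the pool on another system.
# The pool name "tank" is only an example.
import subprocess

def run(cmd):
    return subprocess.check_output(cmd).decode("utf-8", "replace")

print(run(["zpool", "get", "version", "tank"]))   # version of this particular pool
print(run(["zpool", "upgrade", "-v"]))            # versions this system supports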
The recommendation to use SAS hard drives is not so
much about the quality of the hard drives themselves as it is
about the SAS protocol. The SAS protocol simply handles SCSI
transport commands in a better and more reliable manner than
SATA does. I believe any decent SAS drive would do. As for HBAs I
wrote a list with LSI based hardware a while ago here:
https://www.illumos.org/boards/1/topics/572
the thing is that a lot of OEMs such as IBM, HP, Cisco,
Fujitsu-Siemens, Dell, ... supply their branded HBAs with LSI
circuitry on them. What hardware to choose depends on what
you're looking for. If you want an 8-port controller I would go
for Intel SASUC8I or LSI SAS3801E-R. If you want SAS/SATA3 with
6.0 Gb/s then LSI's LSI SAS3801E-R series cards would be a
better choice. I don't know what OEMs have come up with in the
SATA3 department since I wrote that list but the chips to look
for in that case are the LSI MegaRAID 2004/2008/2016e depending
on how many ports you want.
If you want to read a further discussion about reliability of
different RAID setups I made a post about this in the following
thread (last post):
http://communities.intel.com/thread/25945
The cost of the drives is marginally higher than that of the drives I had
budgeted, but including the cost of an SAS HBA is problematic. I
hadn't expected to purchase an HBA from the very start. I'd hoped
to defer such a purchase for a bit, but you are saying that would
be ill-advised. How likely is it I would encounter problems using
SATA drives in the interim?
The problem with "normal" hardware is that the error handling is
internal. If a hard drive is starting to develop bad sectors, the drive
will handle them internally and reallocate the bad sectors. You will
not notice anything and the errors will not be reported to the
system. At most you will experience sluggish performance, perhaps a
freeze. When things go so bad that errors are starting to become
visible, it is too late to do anything about it; the drive is
already dead. In the past, drives would return "garbage"; these
days the drive will reread the sector over and over again
while trying to recalibrate the head, which gives rise to the
characteristic click-of-death noise.
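You can claw back a little of that visibility on plain SATA drives by polling SMART yourself and watching the reallocated and pending sector counts, something along these lines (assuming smartmontools is installed and run as root; /dev/sda is only an example device):

# Poll a few tell-tale SMART attributes on a (SATA) drive.
# Assumes smartmontools is installed and the script is run as root;
# /dev/sda is only an example device node.
import subprocess

WATCH = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def smart_warnings(device="/dev/sda"):
    # smartctl uses its exit code as a status bitmask, so don't treat a
    # non-zero return as a hard failure; just read whatever it printed.
    proc = subprocess.Popen(["smartctl", "-A", device], stdout=subprocess.PIPE)
    out = proc.communicate()[0].decode("utf-8", "replace")
    return [line.strip() for line in out.splitlines()
            if any(attr in line for attr in WATCH)]

if __name__ == "__main__":
    for line in smart_warnings():
        print(line)

It is not a replacement for what a proper SAS HBA reports, but at least a rising reallocated-sector count gives some warning before the drive dies.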
If you use a "cheap" HBA with say ZFS, the tools for monitoring the
hardware such as iostat will be pretty much useless. Things will
'look' just fine with no errors and when errors start to show, they
will happen out of the blue with no prior warning. The system drive
of a ZFS file server that I run just crashed like that without
warning. Prior to that, all I had on the drive were some corrupt
blocks of the hard drive image of a virtual machine. It was
connected to the Southbridge OnChip SATA controller of the
motherboard. I didn't cry about it, although there were some system
files that I would have liked to keep. I replaced it with a
server-grade SATA drive (WD RE4). I have suffered corruption on that
drive on one of the virtual hard drive images. Since I don't see
much sense in using RAID on the system drive (other people may beg
to differ), I enabled the "ditto blocks" feature on that drive and
recovered the image file using 'ddrescue'. The drive has been doing fine
ever since.
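For reference, the "ditto blocks" feature is just the copies property on the dataset, so enabling and checking it looks roughly like this ('rpool/images' is only an example dataset name, and the extra copies only apply to data written after the property is set):

# Enable ZFS "ditto blocks" (extra copies of each data block) on a dataset
# and verify the setting. "rpool/images" is only an example dataset name.
# Note: copies=2 only applies to data written after the property is set.
import subprocess

dataset = "rpool/images"
subprocess.check_call(["zfs", "set", "copies=2", dataset])
print(subprocess.check_output(["zfs", "get", "copies", dataset]).decode("utf-8", "replace"))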
On the storage pool of that system I have already replaced two SATA
drives on my SAS controller because they were about to die. What
happened was that the server started to freeze, but I couldn't tell
what the problem was. Everything looked OK with iostat; was it a
driver issue? I couldn't tell. Eventually I started to see
'media errors' with iostat, and when I replaced the drive the pool
didn't freeze anymore. Then the same thing happened again and I
replaced another hard drive the same way. That's the reality of it:
if the drives had been SAS I would have been able to tell what was wrong
at an earlier stage. There was a discussion about this about 6
months ago on the openindiana-discuss mailing list.
Have you seen any price/feature advantage to reseller versions of
LSI OEM products? I would think the offerings from Cisco, Dell,
HP, and IBM would come at a premium as opposed to purchasing LSI
branded hardware. Anyway, I thought there might be specific
models from particular manufacturers to meet the "best practices"
for ZFS. However, from your suggestions, I gather it really
doesn't matter as long as the HBA is Intel or LSI?
That's something you have to look for at your dealers. My experience
is that the OEM versions generally are cheaper than LSI branded
products. You can compare yourself here:
Intel SASUC8I
http://amzn.to/QjtOR9
LSI 3081E-R
http://amzn.to/USps9x
Both are essentially the same card but the original LSI one is more
expensive than the one from Intel.
I
wonder, should we perhaps move the discussion of ZFS to private
email? It is a bit off-topic.
I think a better idea is to take this to the openindiana-discuss
mailing list and/or the FreeNAS mailing lists/forums as other people
with ZFS experience will read your posts. If you post on the
openindiana-discuss I will read your post there as well.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users