
RE: [Xen-users] Intel Quad NIC made visible in guest -> system crash


  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Alexander Menk <alex.menk.lists2@xxxxxxxxxxxxxx>
  • Date: Wed, 10 Jun 2009 11:39:32 +0300
  • Delivery-date: Wed, 10 Jun 2009 01:40:19 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>


On Wed, 2009-06-10 at 08:22 +0000, Fischer, Anna wrote:
> > Subject: RE: [Xen-users] Intel Quad NIC made visible in guest -> system crash
> > 
> > 
> > On Tue, 2009-06-09 at 20:50 +0000, Fischer, Anna wrote:
> > > > Subject: RE: [Xen-users] Intel Quad NIC made visible in guest -> system crash
> > > >
> > > > Hi!
> > > >
> > > > thanks for the reply.
> > > >
> > > > On Tue, 2009-06-09 at 11:38 +0000, Joseph L. Casale wrote:
> > > > > >we have two Intel Quad Nic 82576, PCI ID 8086:10E8 and use the igb
> > > > > >driver 1.3.19.3 on Debian 5.0.1.
> > > > > >
> > > > > >I used the pciback.hide XEN kernel parameter and made one of the NIC's
> > > > > >interfaces available in a DomU.
> > > > > >
> > > > > >Now, when I am starting the VM, the system crashes (log attached)
> > > > >
> > > > > W/o doing any research myself, I vaguely remember someone here having
> > > > > similar results and suggesting that some nics have a design such that
> > > > > some ports are tied together as a result of sharing components on the
> > > > > nic itself. Basically, you may have a nic that is really only two
> > > > > independent nics, each with two ports so you have to pass two in at
> > > > > once etc.
> > > > >
> > > > > A quick search or test should validate this...
> > > >
> > > > I already blacklisted all 4 ports of the whole nic. Next I blacklisted
> > > > the igb module in dom0 as suggested in
> > > > http://lists.xensource.com/archives/html/xen-users/2007-10/msg00598.html
> > > > where Stephan Seitz recommends not using the module in dom0.
> > > >
> > > > I also disabled MSI interrupts in the igb driver (make
> > > > CFLAGS_EXTRA=-DDISABLE_PCI_MSI install) as the igb readme says there
> > > > might be some problems.
> > > >
> > > > Now, when starting the domU, I no longer get the message that IRQ
> > > > #17 was disabled, but still:
> > > >
> > > > [  623.361836] ACPI: PCI Interrupt 0000:10:00.1[B] -> GSI 17 (level, low) -> IRQ 17
> > > > [  623.362307] pciback 0000:10:00.1: Driver tried to write to a read-only configuration space field at offset 0xa8, size 2. This may be harmless, but if you have problems with your device:
> > > > [  623.362310] 1) see permissive attribute in sysfs
> > > > [  623.362311] 2) report problems to the xen-devel mailing list along with details of your device obtained from lspci.
> > > > [  623.362771] PCI: Setting latency timer of device 0000:10:00.1 to 64
> > > >
> > > > When doing ifup eth0 inside the domU, I get the message that the cable
> > > > is not connected.
> > > >
> > > > Platform is amd64 with 2 Intel Xeon CPUs with 4 cores.
> > > >
> > > > In many places I have read that one should use the boot option
> > > > pciback.permissive; unfortunately my kernel does not support that
> > > > setting. I would be happy to avoid recompiling the kernel, and I have
> > > > read that pciback should work without the permissive flag as well.
> > > >
> > > > Any ideas? please ...
> > >
> > > I am assuming that you are not using the SR-IOV capabilities of the
> > > device?
> > 
> > No, I don't. What is the current status of SR-IOV support in Xen?
> > 
> > >
> > > The 82576 is a multi-function device. If you do an lspci -t then you
> > > should see that all ports have the same bus/slot number and only differ
> > > in the last digit, which is the function ID. I believe that with the
> > > current Xen PCI pass-through you have to co-assign all devices residing
> > > under the same PCI bridge to a single guest domain. So you cannot
> > > assign only a single port to a guest.
> > >
> > > You can also see under /proc/interrupts who is using IRQ 17 (that was
> > > disabled due to an interrupt clash). I guess that something in your
> > > Dom0 is also using it.
> > 
> > The usb devices seem to use this interrupt as well:
> > 
> >  16:       3796          0          0          0          0          0          0          0  Phys-irq-level     arcmsr
> >  17:          0          0          0          0          0          0          0          0  Phys-irq-level     uhci_hcd:usb1, ehci_hcd:usb4
> >  18:        737          0          0          0          0          0          0          0  Phys-irq-level     uhci_hcd:usb3, eth0
> >  19:       4468          0          0          0          0          0          0          0  Phys-irq-level     uhci_hcd:usb2, peth1
> 
> This should not show your peth1 and eth0 device if you have properly disabled 
> those in Dom0.

Why? eth0 and eth1 are the onboard interfaces; eth2-5 and eth6-9 are on
the two Intel quad NICs.
eth1 is used for bridging, which is why it shows up as peth1.

> Is this the output of the running Xen system when the guest is running
> too?
Yes, the guest is already running. The output was taken after the "see
permissive attribute in sysfs" messages appeared in syslog.
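
To make the IRQ sharing explicit, here is a minimal Python sketch that
parses /proc/interrupts-style text and lists the IRQ lines claimed by more
than one handler. The function name shared_irqs and the SAMPLE text (the
four lines quoted above, unwrapped) are only illustrative; on a real system
one would read /proc/interrupts instead. It assumes the layout shown above
(per-CPU counters, a single-word controller type, then the handler list),
which may differ on other kernels.

```python
import re

# The four lines quoted in this thread, unwrapped (8 per-CPU counters each).
SAMPLE = """\
 16:       3796          0          0          0          0          0          0          0  Phys-irq-level     arcmsr
 17:          0          0          0          0          0          0          0          0  Phys-irq-level     uhci_hcd:usb1, ehci_hcd:usb4
 18:        737          0          0          0          0          0          0          0  Phys-irq-level     uhci_hcd:usb3, eth0
 19:       4468          0          0          0          0          0          0          0  Phys-irq-level     uhci_hcd:usb2, peth1
"""

# IRQ number, per-CPU counters, one-word controller type, handler list.
LINE_RE = re.compile(r"\s*(\d+):((?:\s+\d+)+)\s+(\S+)\s+(.*)$")


def shared_irqs(text):
    """Return {irq: [handler names]} for IRQs with more than one handler."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # header line, or non-numeric entries such as NMI
        handlers = [h.strip() for h in m.group(4).split(",") if h.strip()]
        if len(handlers) > 1:
            result[int(m.group(1))] = handlers
    return result


if __name__ == "__main__":
    for irq, names in sorted(shared_irqs(SAMPLE).items()):
        print(irq, names)
```

In the sample, IRQ 17 is held by two USB host controller drivers in dom0,
and the dmesg excerpt above routes 0000:10:00.1 to that same GSI 17.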

I am wondering whether the solution is as simple as compiling a kernel
that supports setting the permissive attribute. Somehow I don't feel
comfortable with that, though; might it mess things up even more?
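
For reference, the "permissive attribute in sysfs" named in the log is,
where pciback provides it, set at runtime by writing the device's PCI
address to it, rather than on the boot command line. The sketch below
assumes the attribute lives at /sys/bus/pci/drivers/pciback/permissive;
that path has not been verified on the kernel discussed here and may be
absent. The function name set_permissive and the sysfs_root parameter are
hypothetical and exist only so the code can be tried against a scratch
directory.

```python
import os
import re

# Assumed location of the attribute, relative to the sysfs mount point.
# It may differ or be missing entirely on a given kernel.
PERMISSIVE_ATTR = "bus/pci/drivers/pciback/permissive"

# domain:bus:slot.function, e.g. 0000:10:00.1
BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def set_permissive(bdf, sysfs_root="/sys"):
    """Write a PCI address to pciback's permissive attribute; return the path."""
    if not BDF_RE.match(bdf):
        raise ValueError("expected domain:bus:slot.function, got %r" % bdf)
    path = os.path.join(sysfs_root, PERMISSIVE_ATTR)
    if not os.path.exists(path):
        # The running pciback does not expose the attribute at this path.
        raise FileNotFoundError("no permissive attribute at %s" % path)
    with open(path, "w") as f:
        f.write(bdf)
    return path


if __name__ == "__main__":
    # Needs root in dom0; 0000:10:00.1 is the function from the log above.
    print(set_permissive("0000:10:00.1"))
```

Permissive mode lets the guest driver write configuration space fields
that pciback would otherwise treat as read-only, so it relaxes a safety
check rather than fixing the underlying interrupt sharing.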

Regards,

Alexander





