
[Xen-users] IO APIC interrupt stuck with irr=1 (was: Re: xen hypervisor does not like my Dom0 LVM partition: I/O Errors)



Adding xen-devel and the x86 maintainers; I hope they can provide more insight
on this.

On Wed, Nov 30, 2016 at 11:08:45PM +0000, Jeff Swicegood wrote:
> jaga@jaga-Desktop:~$ sudo xl debug-keys i
> jaga@jaga-Desktop:~$ sudo xl dmesg
> (XEN) Xen version 4.7.0 (Ubuntu 4.7.0-0ubuntu2) (stefan.bader@xxxxxxxxxxxxx)
> (gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005) debug=n Fri Oct  7 19:25:16
> UTC 2016
> (XEN) Bootloader: GRUB 2.02~beta2-36ubuntu11.1
> (XEN) Command line: placeholder ioapic_ack=old
> (XEN) Video information:
> (XEN)  VGA is text mode 80x25, font 8x16
> (XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
> (XEN) Disc information:
> (XEN)  Found 2 MBR signatures
> (XEN)  Found 5 EDD information structures
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 000000000009e400 (usable)
> (XEN)  000000000009e400 - 00000000000a0000 (reserved)
> (XEN)  00000000000e2c00 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000bf780000 (usable)
> (XEN)  00000000bf780000 - 00000000bf798000 (ACPI data)
> (XEN)  00000000bf798000 - 00000000bf7da000 (ACPI NVS)
> (XEN)  00000000bf7da000 - 00000000c0000000 (reserved)
> (XEN)  00000000fee00000 - 00000000fee01000 (reserved)
> (XEN)  00000000ffe00000 - 0000000100000000 (reserved)
> (XEN)  0000000100000000 - 0000000640000000 (usable)
> (XEN) ACPI: RSDP 000FB940, 0014 (r0 ACPIAM)
> (XEN) ACPI: RSDT BF780000, 0048 (r1 030111 RSDT1120 20110301 MSFT       97)
> (XEN) ACPI: FACP BF780200, 0084 (r1 030111 FACP1120 20110301 MSFT       97)
> (XEN) ACPI: DSDT BF7804B0, C469 (r1  A1682 A1682001        1 INTL 20060113)
> (XEN) ACPI: FACS BF798000, 0040
> (XEN) ACPI: APIC BF780390, 00D8 (r1 030111 APIC1120 20110301 MSFT       97)
> (XEN) ACPI: MCFG BF780470, 003C (r1 030111 OEMMCFG  20110301 MSFT       97)
> (XEN) ACPI: OEMB BF798040, 0072 (r1 030111 OEMB1120 20110301 MSFT       97)
> (XEN) ACPI: HPET BF78F4B0, 0038 (r1 030111 OEMHPET  20110301 MSFT       97)
> (XEN) ACPI: DMAR BF7980C0, 0140 (r1    AMI  OEMDMAR        1 MSFT       97)
> (XEN) ACPI: ASPT BF7984B0, 0034 (r6 030111 PerfTune 20110301 MSFT       97)
> (XEN) ACPI: OSFR BF78F4F0, 00B0 (r1 030111 OEMOSFR  20110301 MSFT       97)
> (XEN) ACPI: SSDT BF79A940, 0363 (r1 DpgPmm    CpuPm       12 INTL 20060113)
> (XEN) System RAM: 24567MB (25156728kB)
> (XEN) Domain heap initialised
> (XEN) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
> (XEN) IOAPIC[1]: apic_id 9, version 32, address 0xfec8a000, GSI 24-47
> (XEN) Enabling APIC mode:  Flat.  Using 2 I/O APICs
> (XEN) [VT-D]  RMRR address range bf7da000..bf7d9fff not in reserved memory;
> need "iommu_inclusive_mapping=1"?
> (XEN) [VT-D]  RMRR (bf7da000, bf7d9fff) is incorrect
> (XEN) Failed to parse ACPI DMAR.  Disabling VT-d.
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) Detected 3207.349 MHz processor.
> (XEN) Initing memory sharing.
> (XEN) PCI: Not using MCFG for segment 0000 bus 00-ff
> (XEN) I/O virtualisation disabled
> (XEN) ENABLING IO-APIC IRQs
> (XEN)  -> Using old ACK method
> (XEN) Platform timer is 14.318MHz HPET
> (XEN) Allocated console ring of 16 KiB.
> (XEN) VMX: Supported advanced features:
> (XEN)  - APIC MMIO access virtualisation
> (XEN)  - APIC TPR shadow
> (XEN)  - Extended Page Tables (EPT)
> (XEN)  - Virtual-Processor Identifiers (VPID)
> (XEN)  - Virtual NMI
> (XEN)  - MSR direct-access bitmap
> (XEN) HVM: ASIDs enabled.
> (XEN) HVM: VMX enabled
> (XEN) HVM: Hardware Assisted Paging (HAP) detected
> (XEN) HVM: HAP page sizes: 4kB, 2MB
> (XEN) Brought up 8 CPUs
> (XEN) Dom0 has maximum 816 PIRQs
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN)  Xen  kernel: 64-bit, lsb, compat32
> (XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x225f000
> (XEN) PHYSICAL MEMORY ARRANGEMENT:
> (XEN)  Dom0 alloc.:   0000000624000000->0000000628000000 (6160573 pages to
> be allocated)
> (XEN)  Init. ramdisk: 000000063d7a4000->000000063ffffdd5
> (XEN) VIRTUAL MEMORY ARRANGEMENT:
> (XEN)  Loaded kernel: ffffffff81000000->ffffffff8225f000
> (XEN)  Init. ramdisk: 0000000000000000->0000000000000000
> (XEN)  Phys-Mach map: 0000008000000000->0000008002f348c8
> (XEN)  Start info:    ffffffff8225f000->ffffffff8225f4b4
> (XEN)  Page tables:   ffffffff82260000->ffffffff82275000
> (XEN)  Boot stack:    ffffffff82275000->ffffffff82276000
> (XEN)  TOTAL:         ffffffff80000000->ffffffff82400000
> (XEN)  ENTRY ADDRESS: ffffffff81f83180
> (XEN) Dom0 has maximum 8 VCPUs
> (XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
> (XEN) ..................................................done.
> (XEN) Initial low memory virq threshold set at 0x4000 pages.
> (XEN) Std. Loglevel: Errors and warnings
> (XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
> (XEN) Xen is relinquishing VGA console.
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input
> to Xen)
> (XEN) Freed 308kB init memory.
> (XEN) IRQ information:
> (XEN)    IRQ:   0 affinity:0001 vec:f0 type=IO-APIC-edge    status=00000000
> time.c#timer_interrupt()
> (XEN)    IRQ:   1 affinity:0040 vec:79 type=IO-APIC-edge    status=00000034
> in-flight=0 domain-list=0:  1(---),
> (XEN)    IRQ:   3 affinity:0001 vec:30 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:   4 affinity:00ff vec:38 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:   5 affinity:0001 vec:40 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:   6 affinity:0001 vec:48 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:   7 affinity:0001 vec:50 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:   8 affinity:0040 vec:81 type=IO-APIC-edge    status=00000030
> in-flight=0 domain-list=0:  8(---),
> (XEN)    IRQ:   9 affinity:0001 vec:60 type=IO-APIC-level   status=00000030
> in-flight=0 domain-list=0:  9(---),
> (XEN)    IRQ:  10 affinity:0001 vec:68 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:  11 affinity:0001 vec:70 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:  12 affinity:0040 vec:71 type=IO-APIC-edge    status=00000030
> in-flight=0 domain-list=0: 12(---),
> (XEN)    IRQ:  13 affinity:00ff vec:88 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:  14 affinity:0001 vec:90 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:  15 affinity:0001 vec:98 type=IO-APIC-edge    status=00000002
> mapped, unbound
> (XEN)    IRQ:  16 affinity:0001 vec:a0 type=IO-APIC-level   status=00000030
> in-flight=0 domain-list=0: 16(---),
> (XEN)    IRQ:  17 affinity:00ff vec:d0 type=IO-APIC-level   status=00000002
> mapped, unbound
> (XEN)    IRQ:  18 affinity:0002 vec:b8 type=IO-APIC-level   status=00000010
> in-flight=0 domain-list=0: 18(---),
> (XEN)    IRQ:  19 affinity:0002 vec:b0 type=IO-APIC-level   status=00000010
> in-flight=0 domain-list=0: 19(---),
> (XEN)    IRQ:  20 affinity:0008 vec:29 type=IO-APIC-level   status=00000030
> in-flight=0 domain-list=0: 20(---),
> (XEN)    IRQ:  21 affinity:0040 vec:a8 type=IO-APIC-level   status=00000030
> in-flight=0 domain-list=0: 21(---),
> (XEN)    IRQ:  22 affinity:00ff vec:a9 type=IO-APIC-level   status=00000002
> mapped, unbound
> (XEN)    IRQ:  23 affinity:0040 vec:c0 type=IO-APIC-level   status=00000030
> in-flight=0 domain-list=0: 23(---),
> (XEN)    IRQ:  28 affinity:00ff vec:89 type=IO-APIC-level   status=00000002
> mapped, unbound
> (XEN)    IRQ:  29 affinity:00ff vec:c8 type=IO-APIC-level   status=00000002
> mapped, unbound
> (XEN)    IRQ:  30 affinity:00ff vec:99 type=IO-APIC-level   status=00000002
> mapped, unbound
> (XEN)    IRQ:  37 affinity:0004 vec:b9 type=IO-APIC-level   status=00000030
> in-flight=0 domain-list=0: 37(---),
> (XEN)    IRQ:  48 affinity:00ff vec:d8 type=PCI-MSI         status=00000002
> mapped, unbound
> (XEN)    IRQ:  49 affinity:00ff vec:21 type=PCI-MSI         status=00000002
> mapped, unbound
> (XEN)    IRQ:  50 affinity:0040 vec:31 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:813(---),
> (XEN)    IRQ:  51 affinity:0040 vec:39 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:812(---),
> (XEN)    IRQ:  52 affinity:0040 vec:41 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:811(---),
> (XEN)    IRQ:  53 affinity:0040 vec:49 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:810(---),
> (XEN)    IRQ:  54 affinity:0040 vec:51 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:809(---),
> (XEN)    IRQ:  55 affinity:0040 vec:59 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:808(---),
> (XEN)    IRQ:  56 affinity:0040 vec:61 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:807(---),
> (XEN)    IRQ:  57 affinity:0040 vec:69 type=PCI-MSI/-X      status=00000030
> in-flight=0 domain-list=0:806(---),
> (XEN)    IRQ:  58 affinity:0008 vec:91 type=PCI-MSI         status=00000030
> in-flight=0 domain-list=0:805(---),
> (XEN)    IRQ:  59 affinity:0010 vec:a1 type=PCI-MSI         status=00000010
> in-flight=0 domain-list=0:804(---),
> (XEN)    IRQ:  60 affinity:0040 vec:b1 type=PCI-MSI         status=00000030
> in-flight=0 domain-list=0:803(---),
> (XEN) Direct vector information:
> (XEN)    0x20 -> irq_move_cleanup_interrupt()
> (XEN)    0xf1 -> mce_intel.c#cmci_interrupt()
> (XEN)    0xf2 -> mce_intel.c#intel_thermal_interrupt()
> (XEN)    0xf9 -> pmu_apic_interrupt()
> (XEN)    0xfa -> apic_timer_interrupt()
> (XEN)    0xfb -> call_function_interrupt()
> (XEN)    0xfc -> event_check_interrupt()
> (XEN)    0xfd -> invalidate_interrupt()
> (XEN)    0xfe -> error_interrupt()
> (XEN)    0xff -> spurious_interrupt()
> (XEN) IO-APIC interrupt information:
> (XEN)     IRQ  0 Vec240:
> (XEN)       Apic 0x00, Pin  2: vec=f0 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ  1 Vec121:
> (XEN)       Apic 0x00, Pin  1: vec=79 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:64
> (XEN)     IRQ  3 Vec 48:
> (XEN)       Apic 0x00, Pin  3: vec=30 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ  4 Vec 56:
> (XEN)       Apic 0x00, Pin  4: vec=38 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=1 dest_id:1
> (XEN)     IRQ  5 Vec 64:
> (XEN)       Apic 0x00, Pin  5: vec=40 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ  6 Vec 72:
> (XEN)       Apic 0x00, Pin  6: vec=48 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ  7 Vec 80:
> (XEN)       Apic 0x00, Pin  7: vec=50 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ  8 Vec129:
> (XEN)       Apic 0x00, Pin  8: vec=81 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:64
> (XEN)     IRQ  9 Vec 96:
> (XEN)       Apic 0x00, Pin  9: vec=60 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=L mask=0 dest_id:1
> (XEN)     IRQ 10 Vec104:
> (XEN)       Apic 0x00, Pin 10: vec=68 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ 11 Vec112:
> (XEN)       Apic 0x00, Pin 11: vec=70 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ 12 Vec113:
> (XEN)       Apic 0x00, Pin 12: vec=71 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:64
> (XEN)     IRQ 13 Vec136:
> (XEN)       Apic 0x00, Pin 13: vec=88 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=1 dest_id:1
> (XEN)     IRQ 14 Vec144:
> (XEN)       Apic 0x00, Pin 14: vec=90 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ 15 Vec152:
> (XEN)       Apic 0x00, Pin 15: vec=98 delivery=LoPri dest=L status=0
> polarity=0 irr=0 trig=E mask=0 dest_id:1
> (XEN)     IRQ 16 Vec160:
> (XEN)       Apic 0x00, Pin 16: vec=a0 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:1
> (XEN)     IRQ 17 Vec208:
> (XEN)       Apic 0x00, Pin 17: vec=d0 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=1 dest_id:255
> (XEN)     IRQ 18 Vec184:
> (XEN)       Apic 0x00, Pin 18: vec=b8 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:2
> (XEN)     IRQ 19 Vec176:
> (XEN)       Apic 0x00, Pin 19: vec=b0 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:2
> (XEN)     IRQ 20 Vec 41:
> (XEN)       Apic 0x00, Pin 20: vec=29 delivery=LoPri dest=L status=1
> polarity=1 irr=1 trig=L mask=0 dest_id:8

So this IO APIC pin (IRQ 20, vec=29) seems to be stuck with irr=1. I had
assumed that Xen would ack the interrupt itself once a certain timeout has
passed and the guest has not done so, but I could be mistaken. I've seen
similar issues on other boxes, and it seems to always happen on boxes with
more than one IO APIC. In the past I've solved it by setting ioapic_ack=old,
but that doesn't seem to work in his case (the command line above already
has it set).
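
For anyone wanting to check a long `xl dmesg` dump for the same symptom, here is a minimal Python sketch that scans the "IO-APIC interrupt information" section for level-triggered pins with irr=1. It is not part of Xen. The function name is hypothetical, and the regular expression assumes the field layout shown in the log above. Note that irr=1 in a single snapshot may just be an interrupt in flight; it only points at a stuck pin if it is still set across repeated dumps.

```python
import re

# One IO-APIC pin entry as printed by the 'i' debug key. The fields may be
# split across two lines when the log has been wrapped by a mail client.
PIN_RE = re.compile(
    r"Apic\s+0x([0-9a-fA-F]+),\s+Pin\s+(\d+):\s+vec=([0-9a-fA-F]+)\s+"
    r"delivery=\S+\s+dest=\S+\s+status=(\d)\s+"
    r"polarity=(\d)\s+irr=(\d)\s+trig=([EL])\s+mask=(\d)"
)


def pending_level_pins(dmesg_text):
    """Return the level-triggered pins whose entry shows irr=1."""
    cleaned = []
    for line in dmesg_text.splitlines():
        # Drop mail quoting ("> ") so wrapped continuation lines join up.
        cleaned.append(line.strip().lstrip(">").strip())
    flat = " ".join(cleaned)

    pins = []
    for m in PIN_RE.finditer(flat):
        apic, pin, vec, _status, _polarity, irr, trig, mask = m.groups()
        if irr == "1" and trig == "L":
            pins.append({
                "apic": int(apic, 16),
                "pin": int(pin),
                "vector": int(vec, 16),
                "masked": mask == "1",
            })
    return pins


# Two entries copied from the log quoted above.
SAMPLE = """\
> (XEN)     IRQ 19 Vec176:
> (XEN)       Apic 0x00, Pin 19: vec=b0 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:2
> (XEN)     IRQ 20 Vec 41:
> (XEN)       Apic 0x00, Pin 20: vec=29 delivery=LoPri dest=L status=1
> polarity=1 irr=1 trig=L mask=0 dest_id:8
"""

if __name__ == "__main__":
    for entry in pending_level_pins(SAMPLE):
        print(entry)
```

Running it over two dumps taken a few seconds apart and comparing the results is one way to tell a pin that stays pending from one that is merely busy.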

Jeff also pasted the output of /proc/interrupts earlier, which shows that
IRQ#20 is indeed used by the ATA controller (ata_piix):

jaga@jaga-Desktop:~$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  1:          2          0          0          0          0          0          0          0  xen-pirq-ioapic-edge  i8042
  8:          0          0          0          0          0          0          0          0  xen-pirq-ioapic-edge  rtc0
  9:          0          0          0          0          0          0          0          0  xen-pirq-ioapic-level  acpi
 12:          4          0          0          0          0          0          0          0  xen-pirq-ioapic-edge  i8042
 16:        369          0          0          0          0          0          0          0  xen-pirq-ioapic-level  uhci_hcd:usb3, ahci[0000:05:00.0]
 18:       1496         75          0          0          0          0          0          0  xen-pirq-ioapic-level  ehci_hcd:usb1, uhci_hcd:usb8, firewire_ohci
 19:       4348          0          0          0          0          0        565          0  xen-pirq-ioapic-level  uhci_hcd:usb5, uhci_hcd:usb7, eth0
 20:      12028          0       1404          0          0          0          0          0  xen-pirq-ioapic-level  ata_piix, ata_piix
 21:          0          0          0          0          0          0          0          0  xen-pirq-ioapic-level  uhci_hcd:usb4
 23:       9373          0          0          0          0          0          0          0  xen-pirq-ioapic-level  ehci_hcd:usb2, uhci_hcd:usb6
[...]
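
To cross-check which devices sit behind a given IRQ without reading the table by eye, a small sketch along these lines could be used. The helper name is hypothetical, and it assumes the unwrapped layout the kernel prints: a header row of CPU names, then one row per IRQ with one count per CPU followed by a free-form description.

```python
def irq_summary(proc_interrupts_text, irq):
    """Return per-CPU counts, their total and the description for one IRQ."""
    lines = proc_interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        counts = [int(f) for f in fields[:ncpus]]
        return {
            "per_cpu": counts,
            "total": sum(counts),
            "description": " ".join(fields[ncpus:]),
        }
    return None  # IRQ not listed


# Header and two rows taken from the output quoted above, unwrapped.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
 19:       4348          0          0          0          0          0        565          0  xen-pirq-ioapic-level  uhci_hcd:usb5, uhci_hcd:usb7, eth0
 20:      12028          0       1404          0          0          0          0          0  xen-pirq-ioapic-level  ata_piix, ata_piix
"""

if __name__ == "__main__":
    print(irq_summary(SAMPLE, 20))
```

On a live system the text would come from reading /proc/interrupts. If the total for IRQ 20 stops increasing while the disk is busy, that would be consistent with the pin staying stuck.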

> (XEN)     IRQ 21 Vec168:
> (XEN)       Apic 0x00, Pin 21: vec=a8 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:64
> (XEN)     IRQ 22 Vec169:
> (XEN)       Apic 0x00, Pin 22: vec=a9 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=1 dest_id:255
> (XEN)     IRQ 23 Vec192:
> (XEN)       Apic 0x00, Pin 23: vec=c0 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:64
> (XEN)     IRQ 28 Vec137:
> (XEN)       Apic 0x01, Pin  4: vec=89 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=1 dest_id:255
> (XEN)     IRQ 29 Vec200:
> (XEN)       Apic 0x01, Pin  5: vec=c8 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=1 dest_id:255
> (XEN)     IRQ 30 Vec153:
> (XEN)       Apic 0x01, Pin  6: vec=99 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=1 dest_id:255
> (XEN)     IRQ 37 Vec185:
> (XEN)       Apic 0x01, Pin 13: vec=b9 delivery=LoPri dest=L status=0
> polarity=1 irr=0 trig=L mask=0 dest_id:4
> 
> 
> And here are the messages to prove there was a lost interrupt:
> 
> 11/30/16 5:09 PM    jaga-Desktop    kernel    [10056.569371] ata2: lost
> interrupt (Status 0x58)
> 11/30/16 5:09 PM    jaga-Desktop    kernel    [10056.569402] ata3: lost
> interrupt (Status 0x58)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.187813] DMAR-IR: This
> system BIOS has enabled interrupt remapping
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.187813] interrupt
> remapping is being disabled.  Please
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.458493] ACPI: Using
> IOAPIC for interrupt routing
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517335] ACPI: PCI
> Interrupt Link [LNKA] (IRQs 3 4 6 7 10 *11 12 14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517374] ACPI: PCI
> Interrupt Link [LNKB] (IRQs 3 4 6 7 *10 11 12 14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517413] ACPI: PCI
> Interrupt Link [LNKC] (IRQs 3 4 6 7 10 11 12 *14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517451] ACPI: PCI
> Interrupt Link [LNKD] (IRQs *3 4 6 7 10 11 12 14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517489] ACPI: PCI
> Interrupt Link [LNKE] (IRQs 3 4 6 7 10 11 12 14 *15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517527] ACPI: PCI
> Interrupt Link [LNKF] (IRQs 3 4 6 *7 10 11 12 14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517565] ACPI: PCI
> Interrupt Link [LNKG] (IRQs 3 4 *6 7 10 11 12 14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    0.517603] ACPI: PCI
> Interrupt Link [LNKH] (IRQs 3 4 6 7 *10 11 12 14 15)
> 11/30/16 6:00 PM    jaga-Desktop    kernel    [    6.232927] EDAC MC0:
> Giving out device to module i7core_edac.c controller i7 core #0: DEV
> 0000:ff:03.0 (INTERRUPT)

Roger.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

 

