Re: [Xen-devel] Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
On Wed, 2011-08-24 at 21:24 +0100, Konrad Rzeszutek Wilk wrote:
> On Mon, Aug 22, 2011 at 10:00:11AM +0100, Ian Campbell wrote:
> > @xen-devel:
> >
> > Does this look familiar to anyone? This is (I expect, hopefully Giuseppe
> > will confirm) from Debian Squeeze which has a Xen 4.0.x with a PVops
> > dom0 kernel based on xen.git from last summer (e73f4955a821) with more
> > recent upstream longterm kernels (up to and including 2.6.32.41) merged
> > in. While it does seem to have the switch from level to edge triggered
> > interrupts, the Debian kernel doesn't appear to have the switch to fasteoi
> > for pirqs (0672fb44a111 plus a few followups) -- could that be related
> > to this? (I'm not sure if that was a cleanup or a fix)
>
> It was a fix. We had some interrupts getting wedged - but I don't recall
> the stack exactly.

OK, sounds very much like those fixes are worth a try then. Thanks.

> But there are some follow-ups - like
> e5ac0bda96c495321dbad9b57a4b1a93a5a72e7f
> 7e186bdd0098b34c69fb8067c67340ae610ea499

The list of changesets against drivers/xen/events.c which are not in the
Debian kernel which I came up with is below [0]. A small number are
false positives (Debian already got them via the longterm branches) but
most are not. The majority look like real fixes to me, either for this
particular issue or for other problems. I would consider them all
candidates for inclusion in a future update of the Debian kernel.

Giuseppe, are you able to reproduce the issue you are seeing at will? If
I build a test kernel would you be able to try it? You are using a -686
kernel, right (as opposed to amd64)? OOI which hypervisor flavour do you
use?

> The interesting thing about the stack trace is that it looks similar to:
>
> http://groups.google.com/group/linux.kernel/browse_thread/thread/39a397566cafc979
>
> which has some fixes https://patchwork.kernel.org/patch/1091772/
> but they may not help.

Looks like it is an issue on native too.
If it is an issue as far back as 2.6.32 as well I expect we'll see the
fix via the longterm channels at some point.

Ian.

[0]
652c98bac315a2253628885f05cfd5f30b553ae5 xen: Use IRQF_FORCE_RESUME
f9f09329407e3a11140827ba71d8f9d9ede42823 xen: events: do not unmask event channels on resume
ea2020837ca7dc2c9bcfc477fb4d261cf067db4f xen: do not try to allocate the callback vector again at restore time
acad13511ebe1db666aab5807117d3ac647ea58d xen: events: Remove redundant clear of l2i at end of round-robin loop
0e2ec1fb16f9ca84f91de3d9427a0964d679738a xen: events: Make round-robin scan fairer by snapshotting each l2 word
188449f889c6c30709c7e9e8710b9eff14fd963f xen: events: Clean up round-robin evtchn scan.
1acdebd2d67f71d230f5857c28843e636b7dd92e xen: events: Make last processed event channel a per-cpu variable.
2d9c33e1b47b800e43a1444a65353fcb96e27165 xen: events: Process event channels notifications in round-robin order.
2b1c9503c615f68262ae2e96ee26ee128b486287 xen/events: only unmask irq if enabled
c756a6e7f711308ce85afc7d4c79213cce58a033 xen: allocate irq descriptors on any numa node
b1a003a2aa9ee0d3d69237725c91839f4b6a8559 xen/events: use locked set|clear_bit() for cpu_evtchn_mask
cca68cf2d344eb3c4ff996e99f36cf8f8382bc2b xen/evtchn: clear secondary CPUs' cpu_evtchn_mask[] after restore
c7ff70d2824191af119091d3af8db3bb57b06f77 xen: events: do not unmask event channels on resume
d4283609c7504309b8b93d7582857ff4623105f3 xen: improvements to VIRQ_DEBUG output
7c42097171f2e0beafa16e007a06e464b3014bea xen: correct parameter type for pirq_eoi
97708051c14157e95e25d112c26902f1c6fbb462 xen: ensure that all event channels start off bound to VCPU 0
e05885b24a55db82fbdb5cbc3f31426b976d7fc1 xen: set up IRQ before binding virq to evtchn
f0d4a0552f03b52027fb2c7958a1cbbe210cf418 xen/apic: fix pirq_eoi_gmfn resume
d2ea486300ca6e207ba178a425fbd023b8621bb1 xen/pirq: use fasteoi for MSI too
158d6550716687486000a828c601706b55322ad0 xen/pirq: use eoi as enable
2390c371ecd32d9f06e22871636185382bf70ab7 xen/events: use PHYSDEVOP_pirq_eoi_gmfn to get pirq need-EOI info
cb23e8d58ca35b6f9e10e1ea5682bd61f2442ebd xen/evtchn: correction, pirq hypercall does not unmask
43d8a5030a502074f3c4aafed4d6095ebd76067c xen/evtchn: pirq_eoi does unmask
f4526f9a78ffb3d3fc9f81636c5b0357fc1beccd xen/evtchn: make pirq enable/disable unmask/mask
c6a16a778f86699b339585ba5b9197035d77c40f xen/evtchn: rename retrigger_dynirq -> irq
d0936845a856816af2af48ddf019366be68e96ba xen/evtchn: rename enable/disable_dynirq -> unmask/mask_irq
2789ef00cbe2cdb38deb30ee4085b88befadb1b0 xen: make pirq interrupts use fasteoi
0672fb44a111dfb6386022071725c5b15c9de584 xen/events: change to using fasteoi
9fa90aa72d6af5cc2c2eddf56f9a586035e13ae7 xen: use dynamic_irq_init_keep_chip_data
f55ce8740101c54016544a0d633dc1b6b21244ae Introduce CONFIG_XEN_PVHVM compile option
f61692642a2a2b83a52dd7e64619ba3bb29998af xen/pirq: do EOI properly for pirq events
47cd3eb068a8a0cea124495e525ac16876fa08f6 xen/pci: fix compile error when CONFIG_PCI_XEN disabled
29a2e2a7bd19233c62461b104c69233f15ce99ec xen/apic: use handle_edge_irq for pirq events
6dc7b8080195ed43ee6de5b1d60c65aa719208ad xen/irq: replace boot boot allocator
66fd3052fec7e7c21a9d88ba1a03bc062f5fb53d xen: handle events as edge-triggered
8401e9b96f80f9c0128e7c8fc5a01abfabbfa021 xen: use percpu interrupts for IPIs and VIRQs

-- 
Ian Campbell

A Fortran compiler is the hobgoblin of little minis.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
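
The false positives mentioned for list [0] (changesets that are missing by hash but that the target tree already has via a backport) could be screened for roughly as follows. This is only a sketch, not the method used to build the list: it assumes a backport keeps the original subject line, and the names `parse_changesets`, `duplicate_subjects`, `split_candidates` and `SAMPLE` are made up for illustration. The subjects passed as "already in the target tree" in the usage note are invented too and say nothing about the actual Debian kernel.

```python
import re
from collections import defaultdict

# Matches "<40-hex-sha> <subject>", the layout used in list [0].
ENTRY = re.compile(r"^([0-9a-f]{40})\s+(.+)$")

# A few entries copied from list [0], used only as sample input.
SAMPLE = """\
652c98bac315a2253628885f05cfd5f30b553ae5 xen: Use IRQF_FORCE_RESUME
f9f09329407e3a11140827ba71d8f9d9ede42823 xen: events: do not unmask event channels on resume
c7ff70d2824191af119091d3af8db3bb57b06f77 xen: events: do not unmask event channels on resume
0672fb44a111dfb6386022071725c5b15c9de584 xen/events: change to using fasteoi
"""


def parse_changesets(text):
    """Return (sha, subject) tuples for every 'sha subject' line in text."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return entries


def duplicate_subjects(entries):
    """Map each subject that appears under more than one hash to its hashes."""
    by_subject = defaultdict(list)
    for sha, subject in entries:
        by_subject[subject].append(sha)
    return {subj: shas for subj, shas in by_subject.items() if len(shas) > 1}


def split_candidates(missing, target_subjects):
    """Split changesets missing by hash into (candidates, likely false positives).

    Assumption: a changeset whose subject already occurs in the target tree
    was probably backported under a different hash.
    """
    seen = {subject.lower() for subject in target_subjects}
    candidates, false_positives = [], []
    for sha, subject in missing:
        bucket = false_positives if subject.lower() in seen else candidates
        bucket.append((sha, subject))
    return candidates, false_positives
```

For example, `duplicate_subjects(parse_changesets(SAMPLE))` reports that "xen: events: do not unmask event channels on resume" appears under two different hashes, as it does in list [0]. Calling `split_candidates` with an invented target subject such as "xen: Use IRQF_FORCE_RESUME" would move that one entry into the false-positive bucket. Matching on subject alone is crude, so anything it flags would still need checking by hand.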