Xen project Mailing List

Re: [Xen-devel] Announcing xen/master: pvops git trees rearranged

To: Pasi Kärkkäinen <pasik@xxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

From: Boris Derzhavets <bderzhavets@xxxxxxxxx>

Date: Fri, 16 Oct 2009 00:48:26 -0700 (PDT)

Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>

Delivery-date: Fri, 16 Oct 2009 00:49:09 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Dmesg report for 2.6.31.4, built on F12 and booted as Dom0 under Xen 3.4.1 (installed via rawhide 3.4.1-5) on top of F12 (yum updated):

[drm] Initialized drm 1.1.0 20060810
[drm] radeon default to kernel modesetting.
[drm] radeon kernel modesetting enabled.
xen: registering gsi 16 triggering 0 polarity 1
xen_allocate_pirq: returning irq 16 for gsi 16
xen: --> irq=16
xen_set_ioapic_routing: irq 16 gsi 16 vector 152 ioapic 0 pin 16 triggering 1 polarity 1
radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
radeon 0000:01:00.0: setting latency timer to 64
[drm] radeon: Initializing kernel modesetting.
[drm:radeon_driver_load_kms] *ERROR* Failed to initialize radeon, disabling IOCTL
radeon 0000:01:00.0: PCI INT A disabled
radeon: probe of 0000:01:00.0 failed with error -22

.   .   .   .   .   .

======================================================
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
2.6.31.4 #2
------------------------------------------------------
khubd/28 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
(&retval->lock){......}, at: [<ffffffff81126fa4>] dma_pool_alloc+0x46/0x312

and this task is already holding:
(&ehci->lock){-.....}, at: [<ffffffff814359e4>] ehci_urb_enqueue+0xb4/0xd7c
which would create a new lock dependency:
(&ehci->lock){-.....} -> (&retval->lock){......}

but this new dependency connects a HARDIRQ-irq-safe lock:
(&ehci->lock){-.....}
... which became HARDIRQ-irq-safe at:
[<ffffffff810996d8>] __lock_acquire+0x256/0xc11
[<ffffffff8109a181>] lock_acquire+0xee/0x12e
[<ffffffff81579e9f>] _spin_lock+0x45/0x8e
[<ffffffff814345ec>] ehci_irq+0x41/0x441
[<ffffffff814195d5>] usb_hcd_irq+0x59/0xcc
[<ffffffff810c8200>] handle_IRQ_event+0x62/0x148
[<ffffffff810ca797>] handle_level_irq+0x90/0xf9
[<ffffffff81018038>] handle_irq+0x9a/0xba
[<ffffffff81302342>] xen_evtchn_do_upcall+0x10c/0x1bd
[<ffffffff8101623e>] xen_do_hypervisor_callback+0x1e/0x30
[<ffffffffffffffff>] 0xffffffffffffffff

to a HARDIRQ-irq-unsafe lock:
(purge_lock){+.+...}
... which became HARDIRQ-irq-unsafe at:
... [<ffffffff8109974d>] __lock_acquire+0x2cb/0xc11
[<ffffffff8109a181>] lock_acquire+0xee/0x12e
[<ffffffff81579e9f>] _spin_lock+0x45/0x8e
[<ffffffff81120137>] __purge_vmap_area_lazy+0x63/0x198
[<ffffffff81121a15>] vm_unmap_aliases+0x18f/0x1b2
[<ffffffff8100e400>] xen_alloc_ptpage+0x47/0x75
[<ffffffff8100e46b>] xen_alloc_pte+0x13/0x15
[<ffffffff81115495>] __pte_alloc_kernel+0x6f/0xdd
[<ffffffff81120f42>] vmap_page_range_noflush+0x1c5/0x315
[<ffffffff811210d3>] map_vm_area+0x41/0x6b
[<ffffffff8112122c>] __vmalloc_area_node+0x12f/0x167
[<ffffffff811212f4>] __vmalloc_node+0x90/0xb5
[<ffffffff81121169>] __vmalloc_area_node+0x6c/0x167
[<ffffffff811212f4>] __vmalloc_node+0x90/0xb5
[<ffffffff8112156b>] __vmalloc+0x28/0x3e
[<ffffffff81adb40a>] alloc_large_system_hash+0x12f/0x1fb
[<ffffffff81addc9a>] vfs_caches_init+0xb8/0x140
[<ffffffff81ab5a69>] start_kernel+0x3ef/0x44c
[<ffffffff81ab4d70>] x86_64_start_reservations+0xbb/0xd6
[<ffffffff81ab93b7>] xen_start_kernel+0x5d5/0x5dc
[<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

2 locks held by khubd/28:
#0: (usb_address0_mutex){+.+...}, at: [<ffffffff81414344>] hub_port_init+0x8c/0x81e
#1: (&ehci->lock){-.....}, at: [<ffffffff814359e4>] ehci_urb_enqueue+0xb4/0xd7c

the HARDIRQ-irq-safe lock's dependencies:
-> (&ehci->lock){-.....} ops: 0 {
   IN-HARDIRQ-W at:
                        [<ffffffff810996d8>] __lock_acquire+0x256/0xc11
                        [<ffffffff8109a181>] lock_acquire+0xee/0x12e
                        [<ffffffff81579e9f>] _spin_lock+0x45/0x8e
                        [<ffffffff814345ec>] ehci_irq+0x41/0x441
                        [<ffffffff814195d5>] usb_hcd_irq+0x59/0xcc
                        [<ffffffff810c8200>] handle_IRQ_event+0x62/0x148
                        [<ffffffff810ca797>] handle_level_irq+0x90/0xf9
                        [<ffffffff81018038>] handle_irq+0x9a/0xba
                        [<ffffffff81302342>] xen_evtchn_do_upcall+0x10c/0x1bd
                        [<ffffffff8101623e>] xen_do_hypervisor_callback+0x1e/0x30
                        [<ffffffffffffffff>] 0xffffffffffffffff
. . . .

The most recent 2.6.31.1 build on F12 produced clean dmesg output.
Builds of 2.6.31.4 (same commit on top) on F11 and Ubuntu 9.04 Server
also seem clean.
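
For comparing builds, here is a minimal sketch of how a dmesg dump could be scanned for the two symptoms quoted above (the lockdep lock-order report and DRM *ERROR* lines). The name scan_dmesg is a hypothetical helper, not part of any tool mentioned in this thread, and treating a count of 0 as "clean" is an assumption about what "clean" means here.

```shell
# Hypothetical helper: count lines in a dmesg dump (read from stdin) that
# contain a lockdep lock-order report header or a DRM "*ERROR*" message.
scan_dmesg() {
    grep -cE 'lock order detected|\*ERROR\*'
}
```

Usage would be along the lines of `dmesg | scan_dmesg`, run on each build being compared.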

Boris.

--- On Thu, 10/15/09, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:

From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] Announcing xen/master: pvops git trees rearranged
To: "Pasi Kärkkäinen" <pasik@xxxxxx>
Cc: "Jeremy Fitzhardinge" <jeremy@xxxxxxxx>, "Xen-devel" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Date: Thursday, October 15, 2009, 4:04 PM

On Thu, Oct 15, 2009 at 12:14:15AM +0300, Pasi Kärkkäinen wrote:
> On Mon, Oct 12, 2009 at 04:02:48PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Sun, Oct 11, 2009 at 06:39:00PM +0300, Pasi Kärkkäinen wrote:
> > > On Fri, Sep 18, 2009 at 06:19:41PM -0700, Jeremy Fitzhardinge wrote:
> > > >
> > > > This is definitely a work-in-progress kernel. I'd appreciate all bug
> > > > *and* success reports so I can get some idea of how many people are
> > > > using this thing, and how often there are problems. Patches gratefully
> > > > accepted.
> > > >
> > >
> > > I just tried the latest pv_ops dom0 git tree (11 Oct 2009) on an x86_64 AHCI box.
> > >
> > > The good news is that the dom0 kernel boots up, but there are some error
> > > messages.
> > >
> > > With the default options (modeset) the VGA console doesn't work: it
> > > goes blank (display says "power save") at the beginning of dom0 kernel boot:
> > > http://pasik.reaktio.net/xen/pv_ops-dom0-debug/dmesg-2.6.31.1-2009-10-11.txt
> >
> > This line:
> > [drm:radeon_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
> >
> > Is a pretty good pointer at what the fault is. If you look at git commit
> > 93e7c3850b8431e19c9cba91413066bfd2360671 you will see the band-aid Jeremy added.
> > It looks, though, as if not all of the radeon drivers allocate their ring buffer memory via
> > drm_sg_alloc calls. Not sure how the r100 (and the corresponding X driver) does it.
> > The long error traceback about the HARDIRQ is a red herring in this case.
> >
> > Here are a couple of things that I would like you to try, if you can:
> >
>
> Sure.
>
> > 1). Pass in 'drm.debug=255' and send the output. It should have tons of extra
> > output.
> >
>
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/radeondebug/dmesg-2.6.31.1-2009-10-14-drmdebug.txt
>
> Unknown boot option `drm.debug=255': ignoring

I forgot to mention that you probably need to have CONFIG_DRM set to 'y' instead of 'm'
for this to work. Or you could hack up the initrd (modprobe.conf) and make drm load
with the 'debug=255' parameter.
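
A rough sketch of the initrd route, assuming the initrd has already been unpacked into a directory: INITRD_ROOT is a hypothetical path, and whether the modprobe inside a given initrd reads etc/modprobe.conf is also an assumption to check.

```shell
# Append the drm debug option to the modprobe configuration inside an
# unpacked initrd. INITRD_ROOT is a hypothetical location; adjust it to
# wherever the initrd image was extracted.
INITRD_ROOT=${INITRD_ROOT:-./initrd-root}
mkdir -p "$INITRD_ROOT/etc"
echo 'options drm debug=255' >> "$INITRD_ROOT/etc/modprobe.conf"
```

Passing the parameter directly on the modprobe line that loads drm (modprobe drm debug=255) should have the same effect. Once the module is loaded, the active value can be read back from /sys/module/drm/parameters/debug.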

.. snip ..

> seems to work there! (Fedora kernel contains newer graphics/drm drivers).
>
> But the same USB related error is there with the fedora kernel:
> [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
>
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/radeondebug/dmesg-2.6.31.3-1.2.71.xendom0.fc12.x86_64-2009-10-14.txt

Nah. Still has the same problem:

[drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
[drm:r100_cp_init] *ERROR* radeon: cp isn't working (-

>
>
> > 2). Send in the Xorg.log (or whatever output the program in the userland that
> > starts the modesetting produces). I don't have much knowledge in how modesetting works,
> > so this might require some digging.
> >
>
> Hmm.. yeah, I'm not sure either which is the first program setting up
> graphics mode using kernel modesetting (KMS) in Fedora..
>
> I extracted the initrd image and checked the 'init' script:
>
> echo "Loading drm module"
> modprobe -q drm
> echo "Loading ttm module"
> modprobe -q ttm
> echo "Loading radeon module"
> modprobe -q radeon
> /lib/udev/console_init tty0

add here:
export LIBGL_DEBUG=verbose

> plymouth --show-splash
>
> So I guess plymouth is asking for a graphics mode..

Add this to your kernel command line: plymouth:debug

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
