
Re: [Xen-devel] xen.git pvops kernel bug: i915 bug after memory upgrade



On Wed, Feb 10, 2010 at 03:57:38PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 10, 2010 at 10:44:59PM +1100, Yasir Assam wrote:
> > I upgraded my RAM from 2GB to 8GB today, and I'm no longer able to run  
> > X. My guess is this is a bug in the xen.git kernel (the dom0 kernel) in  
> > the i915 module. Other kernels (vanilla 2.6.32.x) work fine.
> >
> > I have attached the full dmesg log. The problem is completely  
> > reproducible on my machine.
> 
> 1) Can you give me the hardware specs?

Note: Per personal conversation, it was an Asus P7H55-M Pro, which has the
Intel H55 chipset or I965.

> 
> .. snip ..
> > [   23.261678] BUG: unable to handle kernel paging request at 
> > ffffc900000c6000
> > [   23.261685] IP: [<ffffffffa0015226>] intel_i915_chipset_flush+0x22/0x3e 
> > [intel_agp]
> > [   23.261694] PGD 33d2067 PUD 33d3067 PMD 33d4067 PTE 0
> > [   23.261700] Oops: 0002 [#1] SMP 
> > [   23.261703] last sysfs file: /sys/module/i2c_core/initstate
> > [   23.261705] CPU 0 
> > [   23.261707] Modules linked in: i915(+) drm i2c_algo_bit video output 
> > ppdev lp parport sco bnep rfcomm l2cap bluetooth rfkill battery 
> > cpufreq_stats cpufreq_userspace cpufreq_conservative cpufreq_powersave fuse 
> > hwmon_vid k8temp eeprom i2c_nforce2 firewire_sbp2 firewire_core crc_itu_t 
> > loop snd_hda_codec_intelhdmi snd_hda_codec_realtek snd_hda_intel 
> > snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi 
> > snd_rawmidi processor snd_seq_midi_event evdev pcspkr i2c_i801 i2c_core 
> > asus_atk0110 snd_seq snd_timer button snd_seq_device acpi_processor snd 
> > soundcore snd_page_alloc ext3 jbd mbcache dm_mod raid1 md_mod sg sd_mod 
> > crc_t10dif sr_mod cdrom usbhid hid pata_jmicron ata_generic ata_piix libata 
> > scsi_mod ide_pci_generic ehci_hcd r8169 mii ide_core usbcore nls_base 
> > intel_agp thermal fan thermal_sys [last unloaded: scsi_wait_scan]
> > [   23.261775] Pid: 2379, comm: modprobe Not tainted 2.6.31.6-pvops-dom0 #7 
> > System Product Name
> > [   23.261777] RIP: e030:[<ffffffffa0015226>]  [<ffffffffa0015226>] 
> > intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> > [   23.261783] RSP: e02b:ffff880002155a58  EFLAGS: 00010286
> > [   23.261785] RAX: 0000000000000001 RBX: ffff88001e0f7300 RCX: 
> > 0000000000001000
> > [   23.261787] RDX: ffffc900000c6000 RSI: 00000000000007e9 RDI: 
> > ffff88001d5efe00
> > [   23.261789] RBP: ffff88001e96c000 R08: 0000000000000040 R09: 
> > ffff8800016f1000
> > [   23.261792] R10: ffff880000000000 R11: 6db6db6db6db6db7 R12: 
> > 0000000000000001
> > [   23.261794] R13: 00000000007e9000 R14: ffff88001e0f7f00 R15: 
> > 00000000007e9000
> > [   23.261799] FS:  00007f1a00ddd6f0(0000) GS:ffffc90000000000(0000) 
> > knlGS:0000000000000000
> > [   23.261801] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [   23.261803] CR2: ffffc900000c6000 CR3: 000000001dd65000 CR4: 
> > 0000000000002660
> > [   23.261806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> > 0000000000000000
> > [   23.261808] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> > 0000000000000400
> > [   23.261811] Process modprobe (pid: 2379, threadinfo ffff880002154000, 
> > task ffff8800198b8000)
> > [   23.261812] Stack:
> > [   23.261814]  0000000d00000000 000000007ea42086 000000007ea42086 
> > ffffffffa03c387c
> > [   23.261818] <0> ffff88001e0f7f00 000000007ea42086 ffff88001e96c000 
> > ffff88001e0f7300
> > [   23.261823] <0> ffff88001e0f7f00 ffffffffa03c4fc3 ffff88001e0f7300 
> > 0000000000000000
> > [   23.261828] Call Trace:
> > [   23.261840]  [<ffffffffa03c387c>] ? 
> > i915_gem_object_flush_cpu_write_domain+0x30/0x53 [i915]
> > [   23.261849]  [<ffffffffa03c4fc3>] ? 
> > i915_gem_object_set_to_gtt_domain+0x57/0x9d [i915]
> > [   23.261860]  [<ffffffffa03d909a>] ? intelfb_create+0x1e5/0x7a3 [i915]
> > [   23.261866]  [<ffffffff81033525>] ? xen_force_evtchn_callback+0x1d/0x37
> > [   23.261877]  [<ffffffffa03d9a1e>] ? intelfb_probe+0x3c6/0x62e [i915]
> > [   23.261881]  [<ffffffff8103400f>] ? xen_restore_fl_direct_end+0x0/0x1
> > [   23.261894]  [<ffffffffa039d239>] ? 
> > drm_helper_initial_config+0x176/0x19c [drm]
> > [   23.261902]  [<ffffffffa03be2e7>] ? i915_driver_load+0xaa7/0xb3c [i915]
> > [   23.261913]  [<ffffffffa0393399>] ? drm_get_dev+0x321/0x444 [drm]
> > [   23.261919]  [<ffffffff811fc04b>] ? local_pci_probe+0x22/0x3e
> > [   23.261922]  [<ffffffff81033525>] ? xen_force_evtchn_callback+0x1d/0x37
> > [   23.261925]  [<ffffffff811fd30e>] ? pci_device_probe+0x68/0xab
> > [   23.261930]  [<ffffffff81299c91>] ? driver_probe_device+0xa2/0x13a
> > [   23.261933]  [<ffffffff8103400f>] ? xen_restore_fl_direct_end+0x0/0x1
> > [   23.261936]  [<ffffffff81299d8c>] ? __driver_attach+0x63/0x9a
> > [   23.261939]  [<ffffffff81299d29>] ? __driver_attach+0x0/0x9a
> > [   23.261942]  [<ffffffff812990ab>] ? bus_for_each_dev+0x54/0x9d
> > [   23.261945]  [<ffffffff81299674>] ? bus_add_driver+0xbc/0x218
> > [   23.261948]  [<ffffffff8129a185>] ? driver_register+0xa3/0x122
> > [   23.261951]  [<ffffffff811fd5b6>] ? __pci_register_driver+0x5e/0xe7
> > [   23.261959]  [<ffffffffa0383000>] ? i915_init+0x0/0x74 [i915]
> > [   23.261962]  [<ffffffff8100a0f5>] ? do_one_initcall+0x77/0x1c1
> > [   23.261966]  [<ffffffff810ae08f>] ? sys_init_module+0xda/0x223
> > [   23.261970]  [<ffffffff81038fc2>] ? system_call_fastpath+0x16/0x1b
> > [   23.261972] Code: 86 51 06 e1 48 83 c4 18 c3 48 83 ec 18 48 8b 15 f1 80 
> > 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 85 d2 74 04 b0 01 
> > <89> 02 48 8b 44 24 10 65 48 33 04 25 28 00 00 00 74 05 e8 48 51 
> > [   23.262012] RIP  [<ffffffffa0015226>] intel_i915_chipset_flush+0x22/0x3e 
> > [intel_agp]
> > [   23.262017]  RSP <ffff880002155a58>
> > [   23.262019] CR2: ffffc900000c6000
> > [   23.262022] ---[ end trace cf5e2ee5497e2d52 ]---
> > [   26.955198] eth0: no IPv6 routers present
> > [   27.230515] peth0: no IPv6 routers present

In the latest PV-OPS kernel (and in 2.6.31.x) there does not seem to be
any obvious red flag explaining why this would happen. There are two
things that I think might be at fault here:
 1) CONFIG_DMAR was not set and you ended up using the non-PCI DMA
    mapping of pages.
 2) We mapped the wrong address.
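
As a side note on the second possibility: the "Oops: 0002" and "PTE 0"
lines in the quoted trace are consistent with a kernel-mode write to an
address that has no page-table entry behind it. Below is a minimal Python
sketch of how the x86 page-fault error code decodes. The function name is
made up for illustration; the bit meanings are the standard x86 ones.

```python
def decode_page_fault_error(code):
    """Decode the low bits of an x86 page-fault error code (illustrative helper)."""
    flags = []
    # Bit 0: 0 = page not present, 1 = protection violation
    flags.append("protection violation" if code & 0x1 else "page not present")
    # Bit 1: 0 = read access, 1 = write access
    flags.append("write" if code & 0x2 else "read")
    # Bit 2: 0 = kernel mode, 1 = user mode
    flags.append("user mode" if code & 0x4 else "kernel mode")
    # Bit 3: a reserved bit was set in a page-table entry
    if code & 0x8:
        flags.append("reserved bit set")
    # Bit 4: the fault was an instruction fetch
    if code & 0x10:
        flags.append("instruction fetch")
    return flags


# "Oops: 0002" from the trace above; the value is printed in hex.
print(decode_page_fault_error(int("0002", 16)))
```

In the trace, CR2 (the faulting address) and RDX are both ffffc900000c6000,
so the write went through the pointer held in RDX.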

I am perplexed here. But we can narrow this down.

1) Apply the attached patch.
2) With a working setup (perhaps booting the PV-OPS kernel without Xen),
but still with 8GB of RAM, run 'lspci -vvv' and also 'dmesg'.
3) Get a PCI or PCI-e serial card. I've been using the Rosewill RC-301
and RC-301EU with success. I had to figure out the I/O ports from 'lspci'
and put this on my Xen command line: "com1=115200,8n1,0xd800,0". The
0xd800 is what lspci reported as the first I/O port of that serial card.
4) Also add to your Xen command line: "console=com1,vga
guest_loglvl=all"
5) On your Linux kernel command line add: "initcall_debug debug"
6) Compile the kernel and reboot. Make sure to have CONFIG_DMAR=y set.
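
For step 3, here is a rough Python sketch of pulling the first I/O port of
the serial card out of 'lspci -vvv' output and building the com1 option.
The sample text, the "Serial controller" match string and both function
names are assumptions for illustration; your lspci output will differ. The
last field is, as far as I know, the IRQ, and the 0 is simply what I used.

```python
import re

# Hypothetical excerpt of 'lspci -vvv' output; real output has many more lines.
SAMPLE = """\
00:1f.2 IDE interface: Example vendor
\tRegion 0: I/O ports at f0e0 [size=8]
05:00.0 Serial controller: Example vendor PCI serial card
\tRegion 0: I/O ports at d800 [size=8]
\tRegion 1: I/O ports at d400 [size=8]
"""


def first_io_port(lspci_text, device_match="Serial controller"):
    """Return the first I/O port base of the first matching device, or None."""
    in_device = False
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device entry.
            in_device = device_match in line
        elif in_device:
            m = re.search(r"I/O ports at ([0-9a-fA-F]+)", line)
            if m:
                return int(m.group(1), 16)
    return None


def xen_com1_option(io_base, baud=115200, framing="8n1", irq=0):
    """Format the com1= option in the same shape as the one quoted above."""
    return "com1=%d,%s,0x%x,%d" % (baud, framing, io_base, irq)


port = first_io_port(SAMPLE)
if port is not None:
    print(xen_com1_option(port))
```

To try it on a real machine, feed it the saved output of 'lspci -vvv'
instead of SAMPLE.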


Attachment: test.patch
Description: Text document
