
Re: [Xen-devel] Xen-unstable-staging: Xen BUG at iommu_map.c:455



Wednesday, April 1, 2015, 1:38:34 AM, you wrote:

> On 31/03/2015 22:11, Sander Eikelenboom wrote:
>> Hi all,
>>
>> I just tested xen-unstable staging (changeset: git:0522407-dirty) 
>>
>> with revert of commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622
>> (due to an already reported but not yet resolved issue)
>>
>> and built with qemu xen from 
>> git://xenbits.xen.org/staging/qemu-upstream-unstable.git
>> (to include the pci command register patch from Jan)
>>
>>
>> and now came across this new splat when starting an HVM with PCI passthrough:

> Wow - you are getting all the fun bugs at the moment!

> Nothing has changed in the AMD IOMMU driver for a while, but the
> BUG_ON() is particularly unhelpful at identifying what went wrong.

> As a first pass triage, can you rerun with

> diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
> index 495ff5c..f15c324 100644
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -451,8 +451,9 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
>      table = hd->arch.root_table;
>      level = hd->arch.paging_mode;
>
> -    BUG_ON( table == NULL || level < IOMMU_PAGING_MODE_LEVEL_1 ||
> -            level > IOMMU_PAGING_MODE_LEVEL_6 );
> +    BUG_ON(table == NULL);
> +    BUG_ON(level < IOMMU_PAGING_MODE_LEVEL_1);
> +    BUG_ON(level > IOMMU_PAGING_MODE_LEVEL_6);
>
>      next_table_mfn = page_to_mfn(table);

> which will help identify which of the conditions is failing.
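
To illustrate why splitting the combined check helps, here is a minimal standalone sketch (not Xen code). The function name `first_failing_check` and the two enum constants are hypothetical stand-ins, with the level bounds assumed to be 1 and 6. With one check per condition, the reported location (or here, the return value) says which condition tripped, whereas a single combined BUG_ON() only says that one of the three did.

```c
#include <stddef.h>

/* Hypothetical stand-ins for IOMMU_PAGING_MODE_LEVEL_1 / _6;
 * the values 1 and 6 are assumed for illustration. */
enum { PAGING_MODE_LEVEL_1 = 1, PAGING_MODE_LEVEL_6 = 6 };

/*
 * Evaluate the three conditions separately, in the same order as the
 * split BUG_ON()s. Returns 0 if all pass, otherwise the number (1..3)
 * of the first condition that fails.
 */
static int first_failing_check(const void *table, unsigned int level)
{
    if (table == NULL)
        return 1;               /* no root table */
    if (level < PAGING_MODE_LEVEL_1)
        return 2;               /* paging level too small */
    if (level > PAGING_MODE_LEVEL_6)
        return 3;               /* paging level too large */
    return 0;
}
```

For a non-NULL table and a level of 8, this sketch reports the third condition.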

> Can you please also provide the full serial log, including iommu=debug?

> ~Andrew


Hi Andrew,

Finally got some time to figure this out, and I have narrowed it down to:
git://xenbits.xen.org/staging/qemu-upstream-unstable.git
commit 7665d6ba98e20fb05c420de947c1750fd47e5c07 "Xen: Use the ioreq-server API 
when available"
A straight revert of this commit prevents the issue from happening.

The reason I had a hard time figuring this out was:
- I wasn't aware of this earlier, since git pulling the main xen tree doesn't 
  auto-update the qemu-* trees.
- So I happened to hit this when I cloned a fresh tree to try to figure out the 
  other issue I was seeing.
- After that, checking out previous versions of the main xen tree didn't resolve 
  this new issue, because the qemu tree doesn't get auto-updated and is set to 
  "master".
- Cloning xen-stable-4.5.0 made it go away, because that uses a specific 
  git://xenbits.xen.org/staging/qemu-upstream-unstable.git tag which is not 
  master.

*sigh* 

This is tested with xen main tree at last commit 
3a28f760508fb35c430edac17a9efde5aff6d1d5
(normal xen-unstable, not the staging branch)

OK, so I have added some extra debug info (see attached diff), and this is the 
output when it crashes due to something the commit above triggered: the 
level is out of bounds and the pfn looks fishy too.
Complete serial logs from both the bad run and the good run (specific commit 
reverted) are attached.

--
Sander

Here is a snippet from the bad one:

(XEN) [2015-04-10 09:58:15.204] d1: hd->arch.paging_mode:2
(XEN) [2015-04-10 09:58:15.220] d1: hd->arch.paging_mode:2
(XEN) [2015-04-10 09:58:15.236] d1: hd->arch.paging_mode:2
(XEN) [2015-04-10 09:58:15.251] d1: hd->arch.paging_mode:2
(XEN) [2015-04-10 09:58:15.267] d1: hd->arch.paging_mode:2
(XEN) [2015-04-10 09:58:15.282] AMD-Vi: ?!?!? update_paging_mode level after:8
(XEN) [2015-04-10 09:58:15.303] AMD-Vi: ?!?!? amd_iommu_map_page level after update paging mode:8
(XEN) [2015-04-10 09:58:15.329] AMD-Vi: ?!?!? iommu_pde_from_gfn: domid:1 table:1 level:8 pfn:0xffffffffffffffff
(XEN) [2015-04-10 09:58:15.359] Xen BUG at iommu_map.c:459
(XEN) [2015-04-10 09:58:15.375] ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) [2015-04-10 09:58:15.399] CPU:    4
(XEN) [2015-04-10 09:58:15.410] RIP:    e008:[<ffff82d080155b2c>] iommu_pde_from_gfn+0x82/0x47a
(XEN) [2015-04-10 09:58:15.436] RFLAGS: 0000000000010202   CONTEXT: hypervisor
(XEN) [2015-04-10 09:58:15.456] rax: 0000000000000000   rbx: 0000000000000008   rcx: 0000000000000000
(XEN) [2015-04-10 09:58:15.483] rdx: ffff830256f00000   rsi: 000000000000000a   rdi: ffff82d0802986c0
(XEN) [2015-04-10 09:58:15.510] rbp: ffff830256f07ad8   rsp: ffff830256f07a78   r8:  ffff830256f30000
(XEN) [2015-04-10 09:58:15.537] r9:  0000000000000003   r10: 0000000000000149   r11: 0000000000000003
(XEN) [2015-04-10 09:58:15.563] r12: ffff82e0023aaa60   r13: 0000000000000000   r14: 0000000000000000
(XEN) [2015-04-10 09:58:15.590] r15: 00007d2000000000   cr0: 0000000080050033   cr4: 00000000000006f0
(XEN) [2015-04-10 09:58:15.617] cr3: 00000002347fc000   cr2: ffff88001dd58ee0
(XEN) [2015-04-10 09:58:15.638] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) [2015-04-10 09:58:15.663] Xen stack trace from rsp=ffff830256f07a78:
(XEN) [2015-04-10 09:58:15.682]    ffff83024e6c9000 ffff830256f07b30 ffffffffffffffff ffff830200000018
(XEN) [2015-04-10 09:58:15.710]    ffff830256f07ae8 ffff830256f07aa8 ffff82e0049f7700 0000000000000003
(XEN) [2015-04-10 09:58:15.737]    ffff82e0049f76e0 00000000000000e9 0000000000000000 00007d2000000000
(XEN) [2015-04-10 09:58:15.764]    ffff830256f07b98 ffff82d0801560a1 0000000000000003 000000010000010b
(XEN) [2015-04-10 09:58:15.791]    000000000024fbb7 ffffffffffffffff 0000000200000001 ffff83024e6c9938
(XEN) [2015-04-10 09:58:15.818]    ffff820040030ff8 ffff83024e6c9000 ffff82d0802986c0 0000000000000000
(XEN) [2015-04-10 09:58:15.845]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2015-04-10 09:58:15.872]    0000000000000000 0000000000000000 0000000000002000 ffff83024e6c9000
(XEN) [2015-04-10 09:58:15.899]    ffff82e0049f76e0 00000000000000e9 0000000000000000 00007d2000000000
(XEN) [2015-04-10 09:58:15.926]    ffff830256f07bf8 ffff82d08015a56d 0000000000000000 ffff83024e6c9020
(XEN) [2015-04-10 09:58:15.953]    ffff830256f00000 000000000024fbb7 ffff830256f07bf8 0000000000000000
(XEN) [2015-04-10 09:58:15.980]    ffff83024e6c9000 0000000000000800 ffff83024e6c9000 0000000000000000
(XEN) [2015-04-10 09:58:16.007]    ffff830256f07c98 ffff82d08014c607 ffff830256f07c78 ffff82d08012c178
(XEN) [2015-04-10 09:58:16.034]    0000000000000003 ffff830256f07c28 0000000000000020 0000000000000000
(XEN) [2015-04-10 09:58:16.061]    0000000000000000 0000000000000000 0000000800000000 00007fc31c3e1004
(XEN) [2015-04-10 09:58:16.088]    ffff830256eb8f40 ffff83025cc6d300 ffff82d080330c60 00007fc31c3e1004
(XEN) [2015-04-10 09:58:16.115]    ffff83024e6c9000 00007fc31c3e1004 ffff83024e6c9000 0000000000000005
(XEN) [2015-04-10 09:58:16.142]    ffff830256f07ca8 ffff82d08014900b ffff830256f07d98 ffff82d080161f2d
(XEN) [2015-04-10 09:58:16.169]    0000000000000020 0000000000000004 0000000000000003 0000000000000001
(XEN) [2015-04-10 09:58:16.196]    ffff82d080331bb8 0000000000000001 ffff830256f07de8 ffff82d080120c10
(XEN) [2015-04-10 09:58:16.223] Xen call trace:
(XEN) [2015-04-10 09:58:16.236]    [<ffff82d080155b2c>] iommu_pde_from_gfn+0x82/0x47a
(XEN) [2015-04-10 09:58:16.259]    [<ffff82d0801560a1>] amd_iommu_map_page+0x17d/0x58e
(XEN) [2015-04-10 09:58:16.281]    [<ffff82d08015a56d>] arch_iommu_populate_page_table+0x179/0x4d8
(XEN) [2015-04-10 09:58:16.307]    [<ffff82d08014c607>] iommu_do_pci_domctl+0x3b7/0x630
(XEN) [2015-04-10 09:58:16.331]    [<ffff82d08014900b>] iommu_do_domctl+0x17/0x1a
(XEN) [2015-04-10 09:58:16.352]    [<ffff82d080161f2d>] arch_do_domctl+0x2469/0x26e1
(XEN) [2015-04-10 09:58:16.375]    [<ffff82d08010497f>] do_domctl+0x1a1f/0x1d60
(XEN) [2015-04-10 09:58:16.396]    [<ffff82d080234c6b>] syscall_enter+0xeb/0x145
(XEN) [2015-04-10 09:58:16.417] 
(XEN) [2015-04-10 09:58:16.426] 
(XEN) [2015-04-10 09:58:16.435] ****************************************
(XEN) [2015-04-10 09:58:16.454] Panic on CPU 4:
(XEN) [2015-04-10 09:58:16.467] Xen BUG at iommu_map.c:459
(XEN) [2015-04-10 09:58:16.482] ****************************************
(XEN) [2015-04-10 09:58:16.501] 
(XEN) [2015-04-10 09:58:16.510] Reboot in five seconds...
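
As a rough cross-check of the "level:8" and "pfn:0xffffffffffffffff" values in the log above, the following standalone sketch models how many page-table levels an all-ones frame number would need. It is a simplified model, not the Xen update_paging_mode code: the helper name `level_needed_for_pfn` is hypothetical, and it assumes each level translates 9 bits of the frame number. Under that assumption, starting from the paging mode of 2 seen in the earlier log lines, an all-ones pfn pushes the level to 8, which is above the maximum of 6 that the BUG_ON() allows.

```c
#include <stdint.h>

/* Assumption for this sketch: each page-table level covers 9 bits of pfn. */
#define PTE_PER_TABLE_SHIFT 9
#define PTE_PER_TABLE_SIZE  (1ULL << PTE_PER_TABLE_SHIFT)

/*
 * Hypothetical helper: starting from the current paging level (>= 1),
 * keep adding levels until `pfn` can be indexed from the root table.
 * Returns the resulting level.
 */
static unsigned int level_needed_for_pfn(uint64_t pfn, unsigned int level)
{
    /* Index that pfn would need within the current root table. */
    uint64_t offset = pfn >> (PTE_PER_TABLE_SHIFT * (level - 1));

    /* Each additional level absorbs another 9 bits of the index. */
    while (offset >= PTE_PER_TABLE_SIZE) {
        level++;
        offset >>= PTE_PER_TABLE_SHIFT;
    }
    return level;
}
```

Calling `level_needed_for_pfn(UINT64_MAX, 2)` in this model yields 8, matching the "level after:8" lines, so the out-of-range level is consistent with the all-ones pfn being passed in rather than with a separate corruption of the paging mode.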

Attachment: iommu-debug.patch
Description: Binary data

Attachment: serial-log-bad
Description: Binary data

Attachment: serial-log-good
Description: Binary data


 

