
Re: [Xen-devel] Xen-unstable: AMD-Vi: update_paging_mode Try to access pdev_list without aquiring pcidevs_lock.



On 31/10/2019 08:31, Jan Beulich wrote:
> On 30.10.2019 23:21, Sander Eikelenboom wrote:
>> Call trace seems to be the same in all cases.
> 
> Thanks much.
> 
>> (XEN) [2019-10-30 22:07:05.748] AMD-Vi: update_paging_mode Try to access 
>> pdev_list without aquiring pcidevs_lock.
>> (XEN) [2019-10-30 22:07:05.748] ----[ Xen-4.13.0-rc  x86_64  debug=y   Not 
>> tainted ]----
>> (XEN) [2019-10-30 22:07:05.748] CPU:    1
>> (XEN) [2019-10-30 22:07:05.748] RIP:    e008:[<ffff82d080265748>] 
>> iommu_map.c#update_paging_mode+0x1f2/0x3eb
>> (XEN) [2019-10-30 22:07:05.748] RFLAGS: 0000000000010286   CONTEXT: 
>> hypervisor (d0v2)
>> (XEN) [2019-10-30 22:07:05.748] rax: ffff830523f9ffff   rbx: 
>> ffff82e004905f00   rcx: 0000000000000000
>> (XEN) [2019-10-30 22:07:05.748] rdx: 0000000000000001   rsi: 
>> 000000000000000a   rdi: ffff82d0804a0698
>> (XEN) [2019-10-30 22:07:05.748] rbp: ffff830523f9f848   rsp: 
>> ffff830523f9f808   r8:  ffff8305320a0000
>> (XEN) [2019-10-30 22:07:05.748] r9:  0000000000000038   r10: 
>> 0000000000000002   r11: 000000000000000a
>> (XEN) [2019-10-30 22:07:05.748] r12: ffff82e004905f00   r13: 
>> 0000000000000003   r14: 0000000000000003
>> (XEN) [2019-10-30 22:07:05.748] r15: ffff83041fb83000   cr0: 
>> 0000000080050033   cr4: 00000000000006e0
>> (XEN) [2019-10-30 22:07:05.748] cr3: 000000040a58d000   cr2: ffff8880604835a0
>> (XEN) [2019-10-30 22:07:05.748] fsb: 00007f4b8f899bc0   gsb: 
>> ffff88807d480000   gss: 0000000000000000
>> (XEN) [2019-10-30 22:07:05.748] ds: 0000   es: 0000   fs: 0000   gs: 0000   
>> ss: e010   cs: e008
>> (XEN) [2019-10-30 22:07:05.748] Xen code around <ffff82d080265748> 
>> (iommu_map.c#update_paging_mode+0x1f2/0x3eb):
>> (XEN) [2019-10-30 22:07:05.748]  3d 3b 7b 22 00 00 75 07 <0f> 0b e9 c2 01 00 
>> 00 48 8d 35 1a ce 13 00 48 8d
>> (XEN) [2019-10-30 22:07:05.748] Xen stack trace from rsp=ffff830523f9f808:
> [...]
>> (XEN) [2019-10-30 22:07:05.748] Xen call trace:
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d080265748>] R 
>> iommu_map.c#update_paging_mode+0x1f2/0x3eb
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d080265ded>] F 
>> amd_iommu_map_page+0x72/0x1c2
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d0802583b6>] F 
>> iommu_map+0x98/0x17e
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d0802586fb>] F 
>> iommu_legacy_map+0x28/0x73
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d08034a4a6>] F 
>> p2m-pt.c#p2m_pt_set_entry+0x4d3/0x844
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d080342e13>] F 
>> p2m_set_entry+0x91/0x128
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d080343c52>] F 
>> guest_physmap_add_entry+0x39f/0x5a3
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d080343f85>] F 
>> guest_physmap_add_page+0x12f/0x138
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d0802201ee>] F 
>> memory.c#populate_physmap+0x2e3/0x505
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d0802221e8>] F 
>> do_memory_op+0x695/0x1bf7
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d080383601>] F 
>> pv_hypercall+0x2ca/0x537
>> (XEN) [2019-10-30 22:07:05.748]    [<ffff82d08038a432>] F 
>> lstar_enter+0x112/0x120
> 
> Now this looks to be a pretty common path, i.e. I wonder why no-one
> before has noticed this message getting logged. Fixing, as it seems,
> will require careful auditing of lock nesting, as the PCI devices
> lock will need to be acquired on a path that's entirely unrelated to
> any PCI operation; I'll try to get to this asap. Is there anything
> special about the guest that triggers this?

Not that I am aware of.
I run with AMD IOMMU debug enabled; others perhaps don't, in which case they
would not get the message.
The platform is perhaps somewhat specific (older AMD 890FX chipset), and I need
the BIOS workaround:
ivrs_ioapic[6]=00:14.0 iommu=on
On the other hand, this setup has run like this for quite some time.

I have 3 guests (HVM) for which I use PCI passthrough, and
for each of those 3 guests I get this message *once* on start of the guest.
        One guest has a soundcard passed through,
        One guest has a USB2 card passed through,
        One guest has a USB3 card passed through.

Another observation is that both the soundcard and USB2 card
still seem to function despite the message.
The USB3 controller goes haywire though (a lot of driver messages in the guest 
during init).

I could try to bisect, but it would be sometime next week before I can get
to that.

At present I run a tree whose latest commit is
ee7170822f1fc209f33feb47b268bab35541351d,
which is stable for me. It predates some of the IOMMU changes and Anthony's
QMP work, which had some issues, but it would be the last known good point
for me to start a bisect from.

I have attached the complete xl dmesg output.

--
Sander


> Jan
> 

Attachment: xl-dmesg.txt
Description: Text document
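
For readers unfamiliar with this class of warning: the log message in this
thread is the kind of diagnostic emitted when code walks a shared list without
holding the lock that protects it. Below is a minimal user-space C sketch of
that pattern. It is not Xen's code: devs_lock, walk_devices() and all other
names are hypothetical, and the "lock" is a single-threaded stand-in that only
records whether it is currently held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a lock that remembers whether it is held. */
struct tracked_lock {
    bool held;
};

/* Hypothetical lock protecting the device list below. */
static struct tracked_lock devs_lock;

static void lock_acquire(struct tracked_lock *l)
{
    l->held = true;
}

static void lock_release(struct tracked_lock *l)
{
    assert(l->held); /* releasing a lock that is not held is a bug */
    l->held = false;
}

static bool lock_is_held(const struct tracked_lock *l)
{
    return l->held;
}

/* Hypothetical device list entry. */
struct dev {
    int id;
    struct dev *next;
};

/*
 * Count the entries in the list. If the caller does not hold devs_lock,
 * log a diagnostic and return -1 instead of touching the list.
 */
static int walk_devices(const struct dev *head)
{
    if (!lock_is_held(&devs_lock)) {
        fprintf(stderr,
                "walk_devices: device list accessed without holding devs_lock\n");
        return -1;
    }

    int n = 0;
    for (const struct dev *d = head; d != NULL; d = d->next)
        n++;
    return n;
}
```

In this sketch, calling walk_devices() before lock_acquire(&devs_lock) prints
the diagnostic and returns -1; calling it between lock_acquire() and
lock_release() returns the number of list entries. Fixing a real caller means
taking the lock on the calling path, which is why lock nesting has to be
audited first.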

