Re: [Xen-devel] Using debug-key 'o: Dump IOMMU p2m table, locks up machine



Hello Wei,

Monday, September 3, 2012, 5:20:55 PM, you wrote:

> On 09/02/2012 05:14 PM, Sander Eikelenboom wrote:
>> Sunday, September 2, 2012, 4:58:58 PM, you wrote:
>>
>>> On 02/09/2012 09:43, "Sander Eikelenboom"<linux@xxxxxxxxxxxxxx>  wrote:
>>
>>>>> Quite simply, there likely needs to be more tracing on the IOMMU fault
>>>>> path. That's a separate concern from your keyhandler of course, but just
>>>>> saying I'd be looking for the former rather than the latter, for
>>>>> diagnosing Sander's bug.
>>>>
>>>> Are there any printk's I could add to get more relevant info about the
>>>> AMD-Vi: IO_PAGE_FAULT?
>>
>>> No really straightforward one. I think we need a per-IOMMU-type handler to
>>> walk the IOMMU page table for a given virtual address, and dump every
>>> page-table-entry on the path. Like an IOMMU version of show_page_walk().
>>> Personally I would suspect this is more useful than the dump-everything
>>> handlers: just give a *full* *detailed* walk for the actually interesting
>>> virtual address (the one faulted on).
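
To make the idea of a per-address walk concrete, here is a minimal Python
sketch of what such a handler would do with one address. It is a toy model,
not Xen code: the 4 KiB page size, the 9 index bits per level, the "present"
bit in bit 0 and the names level_indices() and walk() are all assumptions
made for illustration, and the page tables are plain dicts rather than
hardware tables.

```python
# Hypothetical model of a multi-level I/O page-table walk; not Xen code.
PAGE_SHIFT = 12      # assumed 4 KiB pages
BITS_PER_LEVEL = 9   # assumed 512 entries per table
PRESENT = 1 << 0     # assumed "present" bit in this toy entry format


def level_indices(addr, levels):
    """Return [(level, index)] from the top level down to level 1."""
    out = []
    for level in range(levels, 0, -1):
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (level - 1)
        out.append((level, (addr >> shift) & ((1 << BITS_PER_LEVEL) - 1)))
    return out


def walk(tables, root, addr, levels):
    """Walk toy tables (dict: table address -> {index: entry}).

    Returns one text line per visited level and stops at the first
    entry that is not marked present.
    """
    lines = []
    table = root
    for level, idx in level_indices(addr, levels):
        entry = tables.get(table, {}).get(idx, 0)
        lines.append("L%d[%3d] = 0x%016x" % (level, idx, entry))
        if not entry & PRESENT:
            lines.append("L%d entry not present, walk stops" % level)
            break
        table = entry & ~0xfff  # address of the next-level table (or frame)
    return lines
```

Under these assumptions, an address of 0xc2c2c2c0 walked through a 3-level
table splits into the indices 3, 22 and 44, so a dump of just those three
entries would already show at which level the translation is missing.
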
>>
>>>> I have attached new output from xl dmesg, this time with iommu=debug on
>>>> (the option changed from 4.1 to 4.2).
>>
>>> Not easy to glean any more from that, without extra tracing such as
>>> described above, and/or digging into the guest to find what driver-side
>>> actions are causing the faults.
>>
>> OK, too bad!
>> With Xen 4.1 I haven't experienced those page faults, but a diff of
>> /xen/drivers/passthrough/amd between both trees shows quite a few changes :(

> Did you also update the Xen tools accordingly? I have sometimes seen a few
> IO_PAGE_FAULTs coming from the NIC when my tools version and HV version did
> not match. But using a recent 4.2 and the corresponding xl, my tests went well.
> BTW: You could also try iommu=no-sharept to see if it helps.

Tried it and it doesn't help.
I have now even got an "xl dmesg" that shows an IO_PAGE_FAULT occurring very
early, before any toolstack or guest can be involved:

(XEN) [2012-09-04 15:51:17] AMD-Vi: Setup I/O page table: device id = 0x0a05, root table = 0x24d84b000, domain = 0, paging mode = 3
(XEN) [2012-09-04 15:51:17] AMD-Vi: Setup I/O page table: device id = 0x0a06, root table = 0x24d84b000, domain = 0, paging mode = 3
(XEN) [2012-09-04 15:51:17] AMD-Vi: Setup I/O page table: device id = 0x0a07, root table = 0x24d84b000, domain = 0, paging mode = 3
(XEN) [2012-09-04 15:51:17] AMD-Vi: Setup I/O page table: device id = 0x0b00, root table = 0x24d84b000, domain = 0, paging mode = 3
(XEN) [2012-09-04 15:51:17] Scrubbing Free RAM: ...........................<0>AMD-Vi: IO_PAGE_FAULT: domain = 0, device id = 0x0a06, fault address = 0xc2c2c2c0
(XEN) [2012-09-04 15:51:18] ............................................done.
(XEN) [2012-09-04 15:51:19] Initial low memory virq threshold set at 0x4000 pages.
(XEN) [2012-09-04 15:51:19] Std. Loglevel: All
(XEN) [2012-09-04 15:51:19] Guest Loglevel: All
(XEN) [2012-09-04 15:51:19] Xen is relinquishing VGA console.


Complete dmesg attached.
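
For anyone who wants to pull the fault records out of such a log, below is a
small Python sketch. It assumes the device id printed by AMD-Vi is the usual
16-bit PCI bus/device/function encoding (bus in the top 8 bits, device in the
next 5, function in the low 3); the helper names decode_bdf() and
find_faults() are made up for this example. Whitespace is collapsed first
because the log lines may arrive wrapped in the middle of a record.

```python
import re

# Matches the fault record format seen in the log in this mail.
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT: domain = (\d+), device id = (0x[0-9a-fA-F]+), "
    r"fault address = (0x[0-9a-fA-F]+)"
)


def decode_bdf(device_id):
    """Format a 16-bit device id as bus:dev.fn (assumed PCI BDF layout)."""
    bus = (device_id >> 8) & 0xff
    dev = (device_id >> 3) & 0x1f
    fn = device_id & 0x7
    return "%02x:%02x.%x" % (bus, dev, fn)


def find_faults(text):
    """Return [(domain, 'bus:dev.fn', fault_address)] for each fault found."""
    flat = " ".join(text.split())  # undo line wrapping inside records
    return [
        (int(dom), decode_bdf(int(dev, 16)), int(addr, 16))
        for dom, dev, addr in FAULT_RE.findall(flat)
    ]
```

Under that assumption, device id 0x0a06 decodes to 0a:00.6, which can then be
compared against the lspci output in dom0 to see which device is faulting.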

> Thanks,
> Wei

-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx

Attachment: xl-dmesg.txt
Description: Text document

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

