[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: dom0 PV looping on search_pre_exception_table()


  • To: Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 9 Dec 2020 19:08:41 +0000
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 09 Dec 2020 19:08:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 09/12/2020 18:57, Manuel Bouyer wrote:
> On Wed, Dec 09, 2020 at 06:08:53PM +0000, Andrew Cooper wrote:
>> On 09/12/2020 16:30, Manuel Bouyer wrote:
>>> On Wed, Dec 09, 2020 at 04:00:02PM +0000, Andrew Cooper wrote:
>>>> [...]
>>>>>> I wonder if the LDT is set up correctly.
>>>>> I guess it is; otherwise it wouldn't boot under Xen 4.13, would it?
>>>> Well - you said you always saw it once on 4.13, which clearly shows that
>>>> something was wonky, but it managed to unblock itself.
>>>>
>>>>>> How about this incremental delta?
>>>>> Here's the output
>>>>> (XEN) IRET fault: #PF[0000]
>>>>> (XEN) %cr2 ffff820000010040, LDT base ffffc4800000a000, limit 0057
>>>>> (XEN) *** pv_map_ldt_shadow_page(0x40) failed
>>>>> (XEN) IRET fault: #PF[0000]
>>>>> (XEN) %cr2 ffff820000010040, LDT base ffffc4800000a000, limit 0057
>>>>> (XEN) *** pv_map_ldt_shadow_page(0x40) failed
>>>>> (XEN) IRET fault: #PF[0000]
>>>> Ok, so the promotion definitely fails, but we don't get as far as
>>>> inspecting the content of the LDT frame.  This probably means it failed
>>>> to change the page type, which probably means there are still
>>>> outstanding writeable references.
>>>>
>>>> I'm expecting the final printk to be the one which triggers.
>>> It's not. 
>>> Here's the output:
>>> (XEN) IRET fault: #PF[0000]
>>> (XEN) %cr2 ffff820000010040, LDT base ffffbd000000a000, limit 0057
>>> (XEN) *** LDT: gl1e 0000000000000000 not present
>>> (XEN) *** pv_map_ldt_shadow_page(0x40) failed
>>> (XEN) IRET fault: #PF[0000]
>>> (XEN) %cr2 ffff820000010040, LDT base ffffbd000000a000, limit 0057
>>> (XEN) *** LDT: gl1e 0000000000000000 not present
>>> (XEN) *** pv_map_ldt_shadow_page(0x40) failed
>> Ok.  So the mapping registered for the LDT is not yet present.  Xen
>> should be raising #PF with the guest, and would be in every case other
>> than the weird context on IRET, where we've confused bad guest state
>> with bad hypervisor state.
> Unfortunately it doesn't fix the problem. I'm now getting a loop of
> (XEN) *** LDT: gl1e 0000000000000000 not present
> (XEN) *** pv_map_ldt_shadow_page(0x40) failed

Oh of course - we don't follow the exit-to-guest path on the way out here.

As a gross hack to check that we've at least diagnosed the issue
appropriately, could you modify NetBSD to explicitly load the %ss
selector into %es (or any other free segment) before first entering user
context?

If it is a sequence of LDT demand-faulting issues, that should cause them
to be fully resolved before Xen's IRET becomes the first actual LDT load.

~Andrew



 

