[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: dom0 PV looping on search_pre_exception_table()


  • To: Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 10 Dec 2020 17:18:39 +0000
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 10 Dec 2020 17:18:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 10/12/2020 17:03, Manuel Bouyer wrote:
> On Thu, Dec 10, 2020 at 03:51:46PM +0000, Andrew Cooper wrote:
>>> [   7.6617663] cs 0x47  ds 0x23  es 0x23  fs 0000  gs 0000  ss 0x3f
>>> [   7.7345663] fsbase 000000000000000000 gsbase 000000000000000000
>>>
>>> so it looks like something resets %fs to 0 ...
>>>
>>> Anyway the fault address 0xffffbd800000a040 is in the hypervisor's range,
>>> isn't it ?
>> No.  It's the kernel's LDT.  From previous debugging:
>>> (XEN) %cr2 ffff820000010040, LDT base ffffbd000000a000, limit 0057
>> LDT handling in Xen is a bit complicated.  To maintain host safety, we
>> must map it into Xen's range, and we explicitly support a PV guest doing
>> on-demand mapping of the LDT.  (This pertains to the experimental
>> Windows XP PV support which never made it beyond a prototype.  Windows
>> can page out the LDT.)  Either way, we lazily map the LDT frames on
>> first use.
>>
>> So %cr2 is the real hardware faulting address, and is in the Xen range. 
>> We spot that it is an LDT access, and try to lazily map the frame (at
>> LDT base), but find that the kernel's virtual address mapping
>> 0xffffbd000000a000 is not present (the gl1e printk).
>>
>> Therefore, we pass #PF to the guest kernel, adjusting vCR2 to what would
>> have happened had Xen not mapped the real LDT elsewhere, which is
>> expected to cause the guest kernel to do whatever demand mapping is
>> necessary to pull the LDT back in.
>>
>>
>> I suppose it is worth taking a step back and ascertaining how exactly
>> NetBSD handles (or, should be handling) the LDT.
>>
>> Do you mind elaborating on how it is supposed to work?
> Note that I'm not familiar with this selector stuff, and I usually get
> it wrong the first time I go back to it.
>
> AFAIK, in the Xen PV case, a page is allocated and mapped in kernel
> space, then registered with Xen via MMUEXT_SET_LDT.
> From what I found, in the common case the LDT is the same for all processes.
> Does that make sense?

The debugging earlier shows that MMUEXT_SET_LDT has indeed been called. 
Presumably 0xffffbd000000a000 is a plausible virtual address for NetBSD
to position the LDT?

However, Xen finds the mapping not-present when trying to demand-map it,
hence why the #PF is forwarded to the kernel.

The way we pull guest virtual addresses was altered by XSA-286 (released
not too long ago despite its apparent age), but *should* have introduced
no functional change.  I wonder if we accidentally broke something there.
What exactly are you running, Xen-wise, with the 4.13 version?

Given that this is init failing, presumably the issue would repro with
the net installer version too?

~Andrew



 

