
Re: [PATCH] x86/PV: conditionally avoid raising #GP for early guest MSR accesses


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 3 Nov 2020 17:31:35 +0000
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 03/11/2020 17:06, Jan Beulich wrote:
> Prior to 4.15 Linux, when running in PV mode, did not install a #GP
> handler early enough to cover for example the rdmsrl_safe() of
> MSR_K8_TSEG_ADDR in bsp_init_amd() (not to speak of the unguarded read
> of MSR_K7_HWCR later in the same function). The respective change
> (42b3a4cb5609 "x86/xen: Support early interrupts in xen pv guests") was
> backported to 4.14, but no further - presumably since it wasn't really
> easy because of other dependencies.
>
> Therefore, to prevent our change in the handling of guest MSR accesses
> from rendering PV Linux 4.13 and older unusable on at least AMD systems, make
> the raising of #GP on these paths conditional upon the guest having
> installed a handler. Producing zero for reads and discarding writes
> isn't necessarily correct and may trip code trying to detect presence of
> MSRs early, but since such detection logic won't work without a #GP
> handler anyway, this ought to be a fair workaround.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

I appreciate that we probably have to do something, but I don't think
this is a wise move.

Linux is fundamentally buggy.  It is deliberately looking for a
potential #GP fault given its use of rdmsrl_safe().  This bug stayed
hidden for so long as a consequence of Xen's inappropriate MSR handling
for guests, and the reasons for changing Xen's behaviour still stand.

This change, in particular, does not apply to any explicitly handled
MSRs, and therefore is not a comprehensive fix.  Nor is it robust to
someone adding code to explicitly handle the impacted MSRs at a later
date (which we are likely to need to do for HWCR), and which would
reintroduce this failure to boot.

We should have the impacted MSRs handled explicitly, with a note stating
this was a bug in Linux 4.14 and older.  We already have workarounds for
similar bugs in Windows, and it also gives us a timeline for eventually
removing support for obsolete workarounds, rather than having a "now and
in the future, we'll explicitly tolerate broken PV behaviour for one bug
back in ancient Linux".
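
To illustrate what handling the impacted MSRs explicitly could look like,
here is a minimal standalone sketch.  It is not Xen's actual code:
pv_read_msr() and enum msr_result are hypothetical names made up for this
example, and only the two MSRs named in the patch description are covered.
The MSR indices are the usual architectural values for MSR_K7_HWCR and
MSR_K8_TSEG_ADDR.

```c
#include <stdint.h>

/* Architectural MSR indices for the two MSRs named in the patch. */
#define MSR_K7_HWCR      0xc0010015u
#define MSR_K8_TSEG_ADDR 0xc0010112u

/* Hypothetical result codes for this sketch; not Xen's real return values. */
enum msr_result { MSR_OKAY, MSR_RAISE_GP };

/* Hypothetical stand-in for the PV guest RDMSR intercept. */
static enum msr_result pv_read_msr(uint32_t msr, uint64_t *val)
{
    switch ( msr )
    {
    case MSR_K7_HWCR:
    case MSR_K8_TSEG_ADDR:
        /*
         * Workaround for a Linux bug: PV kernels lacking commit
         * 42b3a4cb5609 ("x86/xen: Support early interrupts in xen pv
         * guests") read these before installing a #GP handler.
         * Read as zero instead of faulting.
         */
        *val = 0;
        return MSR_OKAY;

    default:
        /* Everything else keeps the strict behaviour: inject #GP. */
        return MSR_RAISE_GP;
    }
}
```

The point of the explicit case labels is that the workaround stays attached
to the specific MSRs and to a comment naming the guest bug, so it survives
later changes to how those MSRs are handled and can be found and removed
once the affected kernels are no longer supported.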

~Andrew



 

