[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] xen/arm: set CPSR Z bit when creating aarch32 guests



On Wed, 23 Mar 2022, Bertrand Marquis wrote:
> > On 22 Mar 2022, at 21:28, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> > 
> > From: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
> > 
> > The first 32 bytes of zImage are NOPs. When CONFIG_EFI is enabled in the
> > kernel, certain versions of Linux will use an UNPREDICATABLE NOP
> > encoding, sometimes resulting in an unbootable kernel. Whether the
> > resulting kernel is bootable or not depends on the processor. See commit
> > a92882a4d270 in the Linux kernel for all the details.
> > 
> > All kernel releases starting from Linux 4.9 without commit a92882a4d270
> > are affected.
> 
> Can you confirm if those kernels are also affected when started natively ?

Theoretically yes, but in practice only booting on Xen is affected
because:

- the issue cannot happen when booting from u-boot because u-boot sets
  the "Z" bit
- the issue cannot happen when booting with QEMU -kernel because it also
  sets "Z"
- older bootloaders on native skip the first 32 bytes of the start
  address, which also masks this problem

Thus, in practice, I have no idea how one could reproduce the problem on
native.

This info is in the commit message a92882a4d270 on Linux and in-code
comments in the kernel.

 
> > Fortunately there is a simple workaround: setting the "Z" bit in CPSR
> > make it so those invalid NOP instructions are never executed. That is
> > because the instruction is conditional (not equal). So, on QEMU at
> > least, the instruction will end up to be ignored and not generate an
> > exception. Setting the "Z" bit makes those kernel versions bootable
> > again and it is harmless in the other cases.
> 
> I agree with Jan here. This will never be set or should not be expected
> to be set by anyone when started.
> It feels to me that we are introducing an ack for a temporary issue in
> Linux which will makes us derive from the behaviour that could be
> expected on native hardware.
> 
> Could you give more details on how blocking this is ? 

Without this change, none of the Debian arm32 kernels boot on Xen after
Jessie (on QEMU).


> Is the kernel update with the fix available on any of the affected 
> distributions ?

None that I could find. I tried Debian Buster, Debian Bullseye, Debian
testing and the latest Alpine Linux. Happy to try more if you give me a
download link or two.


> Depending on the answers I think we could for example have a config around
> this to flag it as workaround for a specific guest issue so that this is only
> activated when needed.

Also note that this alternative workaround also solves the problem,
however it has other drawbacks as Julien described:
[1] https://marc.info/?l=xen-devel&m=164774063802402


My take on this is the following. PSR_GUEST32_INIT is not part of the
ABI so this cannot be considered an ABI change.

But in any case, given that without this change (or another change [1])
most of the kernels out there don't work, is there a point in discussing
ABI breakages? Basically nothing works right now :-D

I think it makes sense to think whether this change could cause a kernel
that used to boot, not to boot anymore. However, I don't think is
possible because:

- we only support zImage on arm32 and "Z" works well with it
- both u-boot and qemu -kernel set "Z" so we would already now if
  something broke



> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
> > ---
> > Changes in v2:
> > - improve commit message
> > - add in-code comment
> > - move PSR_Z to the beginning
> > ---
> > xen/include/public/arch-arm.h | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> > index 94b31511dd..81cee95f14 100644
> > --- a/xen/include/public/arch-arm.h
> > +++ b/xen/include/public/arch-arm.h
> > @@ -361,6 +361,7 @@ typedef uint64_t xen_callback_t;
> > #define PSR_DBG_MASK    (1<<9)        /* arm64: Debug Exception mask */
> > #define PSR_IT_MASK     (0x0600fc00)  /* Thumb If-Then Mask */
> > #define PSR_JAZELLE     (1<<24)       /* Jazelle Mode */
> > +#define PSR_Z           (1<<30)       /* Zero condition flag */
> > 
> > /* 32 bit modes */
> > #define PSR_MODE_USR 0x10
> > @@ -383,7 +384,12 @@ typedef uint64_t xen_callback_t;
> > #define PSR_MODE_EL1t 0x04
> > #define PSR_MODE_EL0t 0x00
> > 
> > -#define PSR_GUEST32_INIT  
> > (PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC)
> > +/*
> > + * We set PSR_Z to be able to boot Linux kernel versions with an invalid
> > + * encoding of the first 8 NOP instructions. See commit a92882a4d270 in
> > + * Linux.
> > + */
> > +#define PSR_GUEST32_INIT  
> > (PSR_Z|PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC)
> > #define PSR_GUEST64_INIT 
> > (PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_EL1h)
> > 
> > #define SCTLR_GUEST_INIT    xen_mk_ullong(0x00c50078)
> > -- 
> > 2.25.1
> > 
> 



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.