|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [XEN v2] xen/arm64: io: Decode 32-bit ldr/str post-indexing instructions
Hi Ayan,
> On 30 Nov 2021, at 19:13, Ayan Kumar Halder <ayan.kumar.halder@xxxxxxxxxx>
> wrote:
>
> Hi Andre,
>
> Thanks for your comments. They are useful.
>
> On 30/11/2021 09:49, Andre Przywara wrote:
>> On Mon, 29 Nov 2021 19:16:38 +0000
>> Ayan Kumar Halder <ayan.kumar.halder@xxxxxxxxxx> wrote:
>> Hi,
>>> At the moment, Xen is only handling data abort with valid syndrome (i.e.
>>> ISV=0). Unfortunately, this doesn't cover all the instructions a domain
>>> could use to access MMIO regions.
>>>
>>> For instance, Xilinx baremetal OS will use:
>>>
>>> volatile u32 *LocalAddr = (volatile u32 *)Addr;
>>> *LocalAddr = Value;
>>>
>>> This leave the compiler to decide which store instructions to use.
>> As mentioned in the other email, this is wrong, if this points to MMIO:
>> don't let the compiler do MMIO accesses. If a stage 2 fault isn't in
>> an MMIO area, you should not see traps that you cannot handle already.
>> So I don't think it's a good idea to use that as an example. And since
>> this patch only seems to address this use case, I would doubt its
>> usefulness in general.
> Yes, I should have fixed the comment.
>
> Currently, I am testing with baremetal app which uses inline assembly code
> with post indexing instructions, to access the MMIO.
>
> ATM, I am testing with 32 bit MMIO only.
>
> On the usefulness, I am kind of torn as it is legitimate for post indexing
> instructions to be used in an inline-assembly code for accessing MMIO.
> However, that may not be something commonly seen.
>
> @Stefano/Bertrand/Julien/Volodymyr :- As you are the Arm mantainers, can you
> comment if we should have decoding logic or not ?
Andre gave you the official statement from Arm and there is nothing more I can
say.
I will leave this decision to Stefano and Julien.
Regards
Bertrand
>
>>> This
>>> may be a post-index store instruction where the HW will not provide a
>>> valid syndrome.
>>>
>>> In order to handle post-indexing store/load instructions, Xen will need
>>> to fetch and decode the instruction.
>>>
>>> This patch only cover post-index store/load instructions from AArch64
>>> mode. For now, this is left unimplemented for trap from AArch32 mode.
>>>
>>> Signed-off-by: Ayan Kumar Halder <ayankuma@xxxxxxxxxx>
>>> ---
>>>
>>> Changelog :-
>>>
>>> v2 :- 1. Updated the rn register after reading from it. (Pointed by
>>> Julien, Stefano)
>>> 2. Used a union to represent the instruction opcode (Suggestd by
>>> Bertrand) 3. Fixed coding style issues (Pointed by Julien)
>>> 4. In the previous patch, I was updating dabt->sign based on the
>>> signedness of imm9. This was incorrect. As mentioned in ARMv8 ARM DDI
>>> 0487G.b, Page 3221, SSE indicates the signedness of the data item
>>> loaded. In our case, the data item loaded is always unsigned.
>>>
>>> This has been tested for the following cases :-
>>> ldr x2, [x1], #4
>> As Jan already mentioned: this is a bad example. First, this is a 64-bit
>> access, which you don't emulate below. And second, you want to keep the
>> pointer aligned. Unaligned accesses to device memory always trap, as per
>> the architecture, even on bare metal.
>>>
>>> ldr w2, [x1], #-4
>>>
>>> str x2, [x1], #4
>> Same as above.
>>> str w2, [x1], #-4
>>>
>>> The reason being I am testing on 32bit MMIO registers only. I don't see
>>> a 8bit or 16bit MMIO register.
>> Where did you look? There are plenty of examples out there, even the GIC
>> allows 8-bit accesses to certain registers (grep for "VGIC_ACCESS_"), and
>> the Linux GIC driver is using them (but with proper accessors, of course).
>> Also GICv3 supports 64-bit accesses to some registers. Some PL011 UARTs use
>> 16-bit MMIO accesses.
> Yes, sorry I see them now. GICD_IPRIORITYR can be accessed with 8 bits.
> Unfortunately, I have GIC-v2 on my hardware system. So, probably I can't test
> 64 bit access.
>
>>> xen/arch/arm/decode.c | 68 ++++++++++++++++++++++++++++++++++++++-
>>> xen/arch/arm/decode.h | 3 +-
>>> xen/arch/arm/io.c | 40 +++++++++++++++++++----
>>> xen/include/asm-arm/hsr.h | 26 +++++++++++++++
>>> 4 files changed, 129 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/decode.c b/xen/arch/arm/decode.c
>>> index 792c2e92a7..0b3e8fcbc6 100644
>>> --- a/xen/arch/arm/decode.c
>>> +++ b/xen/arch/arm/decode.c
>>> @@ -84,6 +84,66 @@ bad_thumb2:
>>> return 1;
>>> }
>>> +static int decode_32bit_loadstore_postindexing(register_t pc,
>>> + struct hsr_dabt *dabt,
>>> + union ldr_str_instr_class
>>> *instr)
>>> +{
>>> + if ( raw_copy_from_guest(&instr->value, (void * __user)pc, sizeof
>>> (instr)) )
>>> + return -EFAULT;
>>> +
>>> + /* First, let's check for the fixed values */
>>> + if ( !((instr->code.fixed1 == 1) && (instr->code.fixed2 == 0) &&
>>> + (instr->code.fixed3 == 0) && (instr->code.fixed4 == 7)) )
>>> + {
>>> + gprintk(XENLOG_ERR, "Decoding not supported for instructions other
>>> than"
>>> + " ldr/str post indexing\n");
>>> + goto bad_32bit_loadstore;
>>> + }
>>> +
>>> + if ( instr->code.size != 2 )
>> I don't see a good reason for this limitation. If you are going to dissect
>> the instruction, why not just support at least all access widths, so
>> 64-bits, but also {ldr,str}{b,w}? I think the framework does the heavy
>> lifting for you already?
>
> I see your point. My intention was to test first with the restricted
> instruction set (ie ldr/str - 32 bit access with post indexing only) and get
> an opinion from the community. If the patch looks sane, then this can be
> extended with other variants as well.
>
>> Same for the restriction to post-index above, supporting pre-index as well
>> should be easy.
> For Pre-indexing instruction, the ISS is valid. So I am not sure what is to
> be done here?
>
> AFAIU, if the ISS is valid, there is no need to explicitly decode the
> instructions.
>> To me this has the bitter taste for being a one trick pony to work around
>> your particular (broken!) use case.
>>> + {
>>> + gprintk(XENLOG_ERR,
>>> + "ldr/str post indexing is supported for 32 bit variant
>>> only\n");
>>> + goto bad_32bit_loadstore;
>>> + }
>>> +
>>> + if ( instr->code.v != 0 )
>>> + {
>>> + gprintk(XENLOG_ERR,
>>> + "ldr/str post indexing for vector types are not supported\n");
>>> + goto bad_32bit_loadstore;
>>> + }
>>> +
>>> + /* Check for STR (immediate) - 32 bit variant */
>>> + if ( instr->code.opc == 0 )
>>> + {
>>> + dabt->write = 1;
>>> + }
>>> + /* Check for LDR (immediate) - 32 bit variant */
>>> + else if ( instr->code.opc == 1 )
>>> + {
>>> + dabt->write = 0;
>>> + }
>>> + else
>>> + {
>>> + gprintk(XENLOG_ERR,
>>> + "Decoding ldr/str post indexing is not supported for this
>>> variant\n");
>>> + goto bad_32bit_loadstore;
>>> + }
>>> +
>>> + gprintk(XENLOG_INFO,
>>> + "instr->code.rt = 0x%x, instr->code.size = 0x%x, instr->code.imm9
>>> = %d\n",
>>> + instr->code.rt, instr->code.size, instr->code.imm9);
>>> +
>>> + update_dabt(dabt, instr->code.rt, instr->code.size, false);
>>> + dabt->valid = 1;
>>> +
>>> + return 0;
>>> +bad_32bit_loadstore:
>>> + gprintk(XENLOG_ERR, "unhandled 32bit Arm instruction 0x%x\n",
>>> instr->value);
>>> + return 1;
>>> +}
>>> +
>>> static int decode_thumb(register_t pc, struct hsr_dabt *dabt)
>>> {
>>> uint16_t instr;
>>> @@ -150,11 +210,17 @@ bad_thumb:
>>> return 1;
>>> }
>>> -int decode_instruction(const struct cpu_user_regs *regs, struct hsr_dabt
>>> *dabt)
>>> +int decode_instruction(const struct cpu_user_regs *regs, struct hsr_dabt
>>> *dabt,
>>> + union ldr_str_instr_class *instr)
>>> {
>>> if ( is_32bit_domain(current->domain) && regs->cpsr & PSR_THUMB )
>>> return decode_thumb(regs->pc, dabt);
>>> + if ( (is_64bit_domain(current->domain) && !psr_mode_is_32bit(regs)) )
>>> + {
>>> + return decode_32bit_loadstore_postindexing(regs->pc, dabt, instr);
>>> + }
>>> +
>>> /* TODO: Handle ARM instruction */
>>> gprintk(XENLOG_ERR, "unhandled ARM instruction\n");
>>> diff --git a/xen/arch/arm/decode.h b/xen/arch/arm/decode.h
>>> index 4613763bdb..d82fc4a0f6 100644
>>> --- a/xen/arch/arm/decode.h
>>> +++ b/xen/arch/arm/decode.h
>>> @@ -35,7 +35,8 @@
>>> */
>>> int decode_instruction(const struct cpu_user_regs *regs,
>>> - struct hsr_dabt *dabt);
>>> + struct hsr_dabt *dabt,
>>> + union ldr_str_instr_class *instr);
>>> #endif /* __ARCH_ARM_DECODE_H_ */
>>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>>> index 729287e37c..0d60754bc4 100644
>>> --- a/xen/arch/arm/io.c
>>> +++ b/xen/arch/arm/io.c
>>> @@ -65,6 +65,16 @@ static enum io_state handle_write(const struct
>>> mmio_handler *handler, return ret ? IO_HANDLED : IO_ABORT;
>>> }
>>> +static void post_incremenet_register(union ldr_str_instr_class *instr)
>>> +{
>>> + struct cpu_user_regs *regs = guest_cpu_user_regs();
>>> + unsigned int val;
>>> +
>>> + val = get_user_reg(regs, instr->code.rn);
>>> + val += instr->code.imm9;
>>> + set_user_reg(regs, instr->code.rn, val);
>>> +}
>>> +
>>> /* This function assumes that mmio regions are not overlapped */
>>> static int cmp_mmio_handler(const void *key, const void *elem)
>>> {
>>> @@ -106,14 +116,26 @@ enum io_state try_handle_mmio(struct cpu_user_regs
>>> *regs, .gpa = gpa,
>>> .dabt = dabt
>>> };
>>> + int rc;
>>> + union ldr_str_instr_class instr = {0};
>>> ASSERT(hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>>> + /*
>>> + * Armv8 processor does not provide a valid syndrome for post-indexing
>>> + * ldr/str instructions. So in order to process these instructions,
>>> + * Xen must decode them.
>>> + */
>>> + if ( !info.dabt.valid )
>>> + {
>>> + rc = decode_instruction(regs, &info.dabt, &instr);
>>> + if ( rc )
>>> + return IO_ABORT;
>>> + }
>>> +
>>> handler = find_mmio_handler(v->domain, info.gpa);
>>> if ( !handler )
>>> {
>>> - int rc;
>>> -
>>> rc = try_fwd_ioserv(regs, v, &info);
>>> if ( rc == IO_HANDLED )
>>> return handle_ioserv(regs, v);
>>> @@ -122,7 +144,7 @@ enum io_state try_handle_mmio(struct cpu_user_regs
>>> *regs, }
>>> /* All the instructions used on emulated MMIO region should be
>>> valid */
>>> - if ( !dabt.valid )
>>> + if ( !info.dabt.valid )
>>> return IO_ABORT;
>>> /*
>>> @@ -134,7 +156,7 @@ enum io_state try_handle_mmio(struct cpu_user_regs
>>> *regs, {
>>> int rc;
>>> - rc = decode_instruction(regs, &info.dabt);
>>> + rc = decode_instruction(regs, &info.dabt, NULL);
>>> if ( rc )
>>> {
>>> gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
>>> @@ -143,9 +165,15 @@ enum io_state try_handle_mmio(struct cpu_user_regs
>>> *regs, }
>>> if ( info.dabt.write )
>>> - return handle_write(handler, v, &info);
>>> + rc = handle_write(handler, v, &info);
>>> else
>>> - return handle_read(handler, v, &info);
>>> + rc = handle_read(handler, v, &info);
>>> +
>>> + if ( instr.value != 0 )
>>> + {
>>> + post_incremenet_register(&instr);
>>> + }
>>> + return rc;
>>> }
>>> void register_mmio_handler(struct domain *d,
>>> diff --git a/xen/include/asm-arm/hsr.h b/xen/include/asm-arm/hsr.h
>>> index 9b91b28c48..72d67d2801 100644
>>> --- a/xen/include/asm-arm/hsr.h
>>> +++ b/xen/include/asm-arm/hsr.h
>>> @@ -15,6 +15,32 @@ enum dabt_size {
>>> DABT_DOUBLE_WORD = 3,
>>> };
>>> +/*
>>> + * Refer to the ARMv8 ARM (DDI 0487G.b), Section C4.1.4 Loads and Stores
>>> + * Page 318 specifies the following bit pattern for
>>> + * "load/store register (immediate post-indexed)".
>>> + *
>>> + * 31 30 29 27 26 25 23 21 20 11 9 4 0
>>> + * ___________________________________________________________________
>>> + * |size|1 1 1 |V |0 0 |opc |0 | imm9 |0 1 | Rn | Rt |
>>> + * |____|______|__|____|____|__|_______________|____|_________|_______|
>>> + */
>>> +union ldr_str_instr_class {
>>> + uint32_t value;
>>> + struct ldr_str {
>>> + unsigned int rt:5; /* Rt register */
>> I don't think it's a particular good idea to use a bit-field here, if that
>> is expected to mimic a certain hardware provided bit pattern.
>> It works in practise (TM), but the C standard does not guarantee the order
>> the bits are allocated (ISO/IEC 9899:201x §6.7.2.1, stanza 11).
>> Since you are *reading* only from the instruction word, you should get away
>> with accessor macros to extract the bits you need. For instance for
>> filtering the opcode, you could use: ((insn & 0x3fe00c00) == 0x38400400)
>
> Yes, this is a very good point. I will use bitmasks to access the bits from
> the register.
>
> I saw the same logic (ie using bitfields) is used for some other registers as
> well. For eg hsr_dabt, hsr_iabt in xen/include/asm-arm/hsr.h. May be that
> needs fixing as well for some other time. :)
>
> - Ayan
>> Cheers,
>> Andre
>>> + unsigned int rn:5; /* Rn register */
>>> + unsigned int fixed1:2; /* value == 01b */
>>> + int imm9:9; /* imm9 */
>>> + unsigned int fixed2:1; /* value == 0b */
>>> + unsigned int opc:2; /* opc */
>>> + unsigned int fixed3:2; /* value == 00b */
>>> + unsigned int v:1; /* vector */
>>> + unsigned int fixed4:3; /* value == 111b */
>>> + unsigned int size:2; /* size */
>>> + } code;
>>> +};
>>> +
>>> union hsr {
>>> register_t bits;
>>> struct {
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |