
hardware aes issue related to xen cache coloring


  • To: xen-devel@xxxxxxxxxxxxx
  • From: Oleg Nikitenko <oleshiiwood@xxxxxxxxx>
  • Date: Thu, 29 Jun 2023 16:21:25 +0300
  • Delivery-date: Thu, 29 Jun 2023 13:15:16 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello guys,

Be warned: this will be a long read...

I am still fighting this. Here is what I have found out.
I built Xen with cache coloring for the Xilinx ZynqMP.
In Xen we enabled hardware encryption for some of our needs.
There is actually only one change, in xen/arch/arm/platforms/xilinx-eemi.c, in the
function id switch/case handler of
bool xilinx_eemi(struct cpu_user_regs *regs, const uint32_t fid, ...).
It only passes our request through to the hardware AES core. It looks like this:

    /* userspace AES operations need to pass */
    case 0xC200002F:
        gprintk(XENLOG_DEBUG, "Forwarding AES operation: %u\n", fid);
        goto forward_to_fw;
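
For context, the forward_to_fw path in that file just re-issues the guest's SMC to the
firmware and copies the result registers back to the guest. A trimmed sketch, based on
the upstream xilinx-zynqmp-eemi.c; the exact register handling in our tree may differ:

forward_to_fw:
    /*
     * Sketch only: re-issue the guest's call (x0..x6) to the Secure
     * Monitor / ATF and hand the firmware's return values back.
     * res is a struct arm_smccc_res declared earlier in xilinx_eemi().
     */
    arm_smccc_1_1_smc(get_user_reg(regs, 0),
                      get_user_reg(regs, 1),
                      get_user_reg(regs, 2),
                      get_user_reg(regs, 3),
                      get_user_reg(regs, 4),
                      get_user_reg(regs, 5),
                      get_user_reg(regs, 6),
                      &res);
    set_user_reg(regs, 0, res.a0);
    set_user_reg(regs, 1, res.a1);
    set_user_reg(regs, 2, res.a2);
    set_user_reg(regs, 3, res.a3);
    return true;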

After flashing it may work for some time, then unpredictably I get a hardware fault.
It shows up as a hardware fault in the PMU/Microblaze:

(XEN) d0v0 Forwarding AES operation: 3254779951
Received exception
MSR: 0x200, EAR: 0x19, EDR: 0x0, ESR: 0x861

These lines are printed by the PMU firmware, from void XPfw_Exception_Handler(void)
in lib/sw_apps/zynqmp_pmufw/src/xpfw_core.c,

or the fault shows up as asserts somewhere in the DMA functions, like:

Assert occurred from file xsecure_aes.c at line 410
Assert occurred from file xcsudma.c at line 140
Assert occurred from file xcsudma.c at line 143

Before that I had already seen a lot of messages about errors in AES requests, like:

[ 188.737910] zynqmp_aes firmware:zynqmp-firmware:zynqmp-aes: ERROR: Gcm Tag mismatch
(XEN) d0v0 Forwarding AES operation: 3254779951

[ 188.748496] zynqmp_aes firmware:zynqmp-firmware:zynqmp-aes: ERROR : Non word aligned data
(XEN) d0v0 Forwarding AES operation: 3254779951

[ 198.826279] zynqmp_aes firmware:zynqmp-firmware:zynqmp-aes: ERROR : Non word aligned data
(XEN) d0v0 Forwarding AES operation: 3254779951

[ 198.837363] zynqmp_aes firmware:zynqmp-firmware:zynqmp-aes: ERROR: Invalid
(XEN) d0v0 Forwarding AES operation: 3254779951

The lines marked with the (XEN) tag are printed from the above-mentioned place in Xen.
All the other lines are printed by the Dom0 kernel, specifically from
static int zynqmp_aes_xcrypt(struct skcipher_request *req, unsigned int flags)
in drivers/crypto/xilinx/zynqmp-aes.c.

An AES request structure contains seven 64-bit fields. Here it is from the Dom0 kernel's
point of view:

struct zynqmp_aes_data {
    u64 src;
    u64 iv;
    u64 key;
    u64 dst;
    u64 size;
    u64 optype;
    u64 keysrc;
};

You can find it in the Xilinx kernel file drivers/crypto/xilinx/zynqmp-aes.c. The memory
for the request is allocated by the driver from the related DMA area.
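
To illustrate that last point, the submission path in the driver boils down to something
like the sketch below. This is a simplified, hypothetical condensation rather than the
actual Xilinx driver code; zynqmp_pm_aes_engine() is the firmware call exported by
drivers/firmware/xilinx/zynqmp.c, while zynqmp_aes_submit() and its parameters are just
illustrative names:

#include <linux/dma-mapping.h>
#include <linux/firmware/xlnx-zynqmp.h>

/*
 * Simplified sketch: the zynqmp_aes_data descriptor and the payload buffers
 * live in DMA-coherent memory, and only the DMA address of the descriptor is
 * passed down to the PMU through the EEMI call. Error handling is omitted.
 */
static int zynqmp_aes_submit(struct device *dev, dma_addr_t src, dma_addr_t iv,
                             dma_addr_t dst, u64 size, u64 optype, u64 keysrc)
{
    struct zynqmp_aes_data *req;
    dma_addr_t req_dma;
    u32 status = 0;
    int ret;

    /* The descriptor itself must also be visible to the PMU/CSU DMA. */
    req = dma_alloc_coherent(dev, sizeof(*req), &req_dma, GFP_KERNEL);
    if (!req)
        return -ENOMEM;

    req->src    = src;     /* DMA address of the input data */
    req->iv     = iv;      /* DMA address of the IV */
    req->key    = 0;       /* 0 when a hardware key source is used */
    req->dst    = dst;     /* DMA address of the output buffer */
    req->size   = size;
    req->optype = optype;  /* encrypt or decrypt */
    req->keysrc = keysrc;

    /*
     * EEMI call: Dom0 -> (hvc) Xen -> (smc) ATF -> (IPI) PMU.
     * Only req_dma travels down this path; the PMU's CSU DMA then reads
     * the descriptor and the payload directly from memory.
     */
    ret = zynqmp_pm_aes_engine(req_dma, &status);

    dma_free_coherent(dev, sizeof(*req), req, req_dma);
    return ret ? ret : status;
}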

So I traced the path of an AES request:
UI application -> Dom0 kernel -> Xen -> Security Monitor
The last stage consists of ARM Trusted Firmware (ATF)
and a separate Microblaze core which executes the request handlers.
From the functional-block point of view it contains the
Platform Management Unit (PMU) and the
Configuration Security Unit (CSU),
which run xilsecure and the other libraries.

The Dom0 kernel invokes Xen with an hvc instruction. The hypervisor invokes the Security Monitor with an smc instruction.
From there the request gets into ARM Trusted Firmware, specifically into the
pm_ret_status pm_aes_engine(...) function in plat/xilinx/zynqmp/pm_service/pm_api_sys.c,
which sends the request to the PMU via an Inter-Processor Interrupt (IPI); see the sketch below.
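
That ATF handler essentially just packs the two 32-bit halves of the request's address
into an IPI payload and waits synchronously for the PMU's answer. A trimmed sketch based
on the upstream pm_api_sys.c; the macro and variable names (PM_PACK_PAYLOAD3, primary_proc)
follow the upstream sources and may differ slightly in the tree we actually run:

enum pm_ret_status pm_aes_engine(uint32_t address_high,
                                 uint32_t address_low,
                                 uint32_t *value)
{
    uint32_t payload[PAYLOAD_ARG_CNT];

    /* Pack the 64-bit address of struct zynqmp_aes_data (already split into
     * high/low 32-bit halves by the SMC caller) into the IPI payload. */
    PM_PACK_PAYLOAD3(payload, PM_SECURE_AES, address_high, address_low);

    /* Send the payload to the PMU over IPI and wait for one return value,
     * the AES status that is eventually reported back to Dom0. */
    return pm_ipi_send_sync(primary_proc, payload, value, 1);
}
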
These errors never appear when the hypervisor is built without cache coloring.
So I dumped the requests on the Dom0 kernel side and on the PMU side in both cases, and got the picture below.
With cache coloring
==== Dom0 kernel part here ====
[   27.263734] zynqmp_aes [0] ffffffc00912d000 1194c000 firmware:zynqmp-firmware:zynqmp-aes
[   27.266314] zynqmp_aes [1] ffffffc009135000 1194e000
[   27.271331] zynqmp_aes [2] dump request align 1 ++
[   27.276143] 00 c0 94 11 00 00 00 00
[   27.279688] 50 c0 94 11 00 00 00 00
[   27.283234] 00 00 00 00 00 00 00 00
[   27.286780] 00 c0 94 11 00 00 00 00
[   27.290327] 40 00 00 00 00 00 00 00
[   27.293874] 00 00 00 00 00 00 00 00
[   27.297420] 01 00 00 00 00 00 00 00
[   27.300967] zynqmp_aes [3] dump request --

========== hypervisor part here ==========
(XEN) d0v1 Forwarding AES operation: 3254779951

==== PMU part here ====
01194E000 agn 1
1E A7 D1 B1 35 22 7B 1F
AE 84 8F 56 99 03 80 3F
15 49 E7 F3 DE C9 E1 17
FB C7 7C 16 CF 58 DF A1
AF CF DC 07 F9 55 49 3F
E0 D9 35 89 50 81 FA AE
87 B6 29 16 96 F6 5F F2

==== ATF part here, I printed this just after the request ====
INFO:   pm_aes_engine ### args 6 ret 0 addr 0 1194e000 ###

==== Back to Dom0 kernel ====
[   27.336699] zynqmp_aes firmware:zynqmp-firmware:zynqmp-aes: ERROR : Non word aligned data

Without cache coloring
==== Dom0 kernel part here ====
[   16.746389] zynqmp_aes [0] ffffffc00910d000 5ecfd000 firmware:zynqmp-firmware:zynqmp-aes
[   16.751548] zynqmp_aes [1] ffffffc009115000 5ed1e000
[   16.756557] zynqmp_aes [2] dump request align 1 ++
[   16.761400] 00 d0 cf 5e 00 00 00 00
[   16.764944] 50 d0 cf 5e 00 00 00 00
[   16.768490] 00 00 00 00 00 00 00 00
[   16.772037] 00 d0 cf 5e 00 00 00 00
[   16.775582] 40 00 00 00 00 00 00 00
[   16.779130] 00 00 00 00 00 00 00 00
[   16.782676] 01 00 00 00 00 00 00 00
[   16.786231] zynqmp_aes [3] dump request --

========== hypervisor part here ==========
(XEN) d0v0 Forwarding AES operation: 3254779951

==== PMU part here ====
05ED1E000 agn 1
00 D0 CF 5E 00 00 00 00
50 D0 CF 5E 00 00 00 00
00 00 00 00 00 00 00 00
00 D0 CF 5E 00 00 00 00
40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00

==== ATF part here, I printed this just after the request ====
INFO:   pm_aes_engine ### args 6 ret 0 addr 0 5ed1e000 ###

==== Back to Dom0 kernel ====
[   16.821959] zynqmp_pm_aes_engine addr 5ed1e000 ret 0

As you can see, without Xen cache coloring the dumps of the request on both sides are identical,
while with cache coloring the request the PMU sees does not match what the Dom0 kernel sent.
So here is my guess.
According to Stefano's comments, the difference between the two modes is in the SMMU.
Without cache coloring, all the memory mapping is done programmatically by the hypervisor.
With Xen cache coloring, all the memory mapping is done by the SMMU.
That, to my mind, is the point: the SMMU does not know anything about the DMA between Dom0 and the PMU.
Could someone suggest a way to fix this?
Thanks for your patience.

Regards
Oleg

 

