[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 5/6] xen/arm: smccc: Add wrapper to automatically select the calling convention
Hi Julien, On 28.08.18 17:05, Julien Grall wrote: Hi Volodymyr, On 27/08/18 17:50, Volodymyr Babchuk wrote:On 27.08.18 18:29, Julien Grall wrote:On 27/08/2018 15:15, Volodymyr Babchuk wrote:This is indeed increasing the size of the function, but with a better performance when you perform an SMC with 1.1.On 24.08.18 19:58, Julien Grall wrote:Signed-off-by: Julien Grall <julien.grall@xxxxxxx>The main goal of SMCCC 1.1 is to limit the number of registers saved when calling an SMC. So if you provide provide a wrapper using a function, then you are better off sticking with SMCCC 1.0.The idea of this code is to provide a faster call on the presence of SMCCC 1.1 while providing a fallback for other case. The function cpus_have_const_cap is implemented using an ALTERNATIVE (similar to static key) that will make if close to a NOP. I am saying close because this is not quite a static key as we don't have it in Xen. Instead, you replace a memory load by a constant.Ah, yes, true.I checked how static keys are working. Seems, they use asm goto feature. Now I think: can we optimize this more? Something like that:#define arm_smccc_smc(...) do { asm goto (ALTERNATIVE(nop", "b %l[smccc_1_1]", ARM_SMCCC_1_1));You would need to list the label in GotoLabels (see 6.45.2.7 [1]). Yes, but as you said, we can use __LINE__ there. The label will get redefined if you have multiple SMC call in the same function. We could possibly generate label based on the file/line.arm_smccc_1_0_smc(__VA_ARGS__); break; smccc_1_1:arm_smccc_1_1_smc(__VA_ARGS__); } while ( 0 ) This will save use additional conditional branch.The code with this patch looks like: mov x0, #0 mov x0, #1 cbz w0, label cbz w0, labelThe mov + conditional branch is likely to be negligible over the cost of switching to EL3. Overall, I am not really convinced that it would be worth the cost of open-coding it. I would prefer if we keep the call to cpus_have_const_cap() and see if it can be optimized. Okay. I'm fine with this. I just saw one more way to optimize and wanted to share it with you. Anyways: Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> I have looked at cpus_have_const_cap() and haven't found good way to optimize it with the current infrastructure in Xen. Feel free to suggest improvement. Another thing: maybe it is worth to branch to 1.0 code and leave 1.1 in a straight path of execution? This will save you one more instruction for SMCCC 1.1 call. -- Volodymyr Babchuk _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |