[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 5/8] arm/vm_event: get/set registers
On 01/06/16 20:38, Julien Grall wrote: Hi Tamas, On 01/06/16 19:21, Tamas K Lengyel wrote:On Wed, Jun 1, 2016 at 5:24 AM, Julien Grall <julien.grall@xxxxxxx> wrote:Hi, On 01/06/16 09:41, Jan Beulich wrote:On 31.05.16 at 18:28, <tamas@xxxxxxxxxxxxx> wrote:On May 31, 2016 01:48, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:On 30.05.16 at 21:47, <tamas@xxxxxxxxxxxxx> wrote:On Mon, May 30, 2016 at 5:50 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:On 30.05.16 at 00:37, <tamas@xxxxxxxxxxxxx> wrote:+struct vm_event_regs_arm32 { + uint32_t r0_usr; + uint32_t r1_usr; + uint32_t r2_usr; + uint32_t r3_usr; + uint32_t r4_usr; + uint32_t r5_usr; + uint32_t r6_usr; + uint32_t r7_usr; + uint32_t r8_usr; + uint32_t r9_usr; + uint32_t r10_usr; + uint32_t r12_usr; + uint32_t lr_usr; + uint32_t fp; + uint32_t pc; + uint32_t sp_usr; + uint32_t sp_svc; + uint32_t spsr_svc; +};It would seem more natural for the "ordinary" registers to be arranged in the numerical sequence, i.e. fp, r12, sp, lr, pc.Not sure I follow.For one it is quite natural for someone looking at a sequence of register values to assume / expect them to be in natural order. And then, having them in natural (numeric) order allows e.g. extracting the register identifying bits from an instruction to use them as an array index into (part of) this structure. (For some background: I've been bitten a number of times by people sorting x86 registers alphabetically instead or naturally, i.e. EAX, EBX, ECX, EDX instead of EAX, ECX, EDX, EBX).I see, however I believe that would be a very careless use of this struct from the user as the register sizes are not even necessarily match the architecture. For example we only define the 64bit x86 registers, so trying to access it as an array of 32bit registers wouldn't work at all. Plus we are not doing a full set of registers, and I rather not imply that every element in the "natural sequence" is present. It may be, it may be not.Once an ABI is set in stone, and if that ABI allows for optimizations (by consumers) like the one mentioned, I don't think this would be careless use. The resulting code is very clearly much more efficient than e.g. a switch() statement with a case label for each and every register. Well, yes, I already hear the "memory is cheap and hence code size doesn't matter" argument, but as said elsewhere quite recently I don't buy this.I agree with Jan here. ARM64 has 31 general purposes registers (x0-x30). The switch to find a register based on the index will be quite big. If you order the register and provide all the general purposes registers (x0 - x30), you will be able to replace by a single line (for instance see select_user_reg in arch/arm/traps.c).The issue is that the x86 vm_event struct right now has 32*uint64_t size. So if we would want to transmit all ARM64 GPRs + cpsr, PC and TTBR0/1 we would end up growing this structure beyond what it is currently. I want to avoid that as it affects both ARM32 and x86 introspection applications as well as now we can hold fewer events on the ring. I would say it is better to avoid that then potentially saving some on a switch in ARM64 introspection applications.The design choice made for x86 should not impact the ARM design (and vice-versa). There are key structures in the public interface which differ between x86 and ARM (see arch_vcpu_info and arch_shared_info). And this is fine because Xen is not meant to run an x86 guest on an ARM hypervisor. As far as I can tell, we currently support VM_EVENT_REASON_MEM_ACCESS and VM_EVENT_REASON_GUEST_REQUEST. So technically the structure is set in stone. However, we have an interface version that could be bumped, we can still decide what is the sensible choice. With your suggestions only a part of the general-purpose registers will be present in the vm_event. I understand that the ring has a limited size. If I counted correctly, the current size of the vm_event structure is 304 bytes. So 15 vm_event slots are available. If we grow the structure for ARM64, i.e 3 64-bits registers (PC, TTBR0, TTBR1) and 1 32-bit register (CPSR). Which mean 311 bytes, i.e 13 Sorry I miscalculated. So it should be: - 332 bytes - 12 slots vm_event slots. If the vm event subsystem is under pressure, I admit that 2 slots could be a lot. However, as not all the GPRs will be available in the structure you may have to fetch them, through XEN_DOMCTL_getvcpucontext I guess (?). The impact of the introspection to the guest will be significant. I cannot tell how often this will be the case, but I can tell you a compiler can do anything with the register allocation. I.e it could decide to privilege allocation in registers x20-x30 (because big number are nicer). If you are still concern about the pressure on the ring page, Razvan suggested to support multi-ring page. Regards, -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |