
Re: Proposal for Porting Xen to Armv8-R64 - DraftA



On 01/03/2022 06:29, Wei Chen wrote:
Hi Julien,

Hi,

-----Original Message-----
From: Julien Grall <julien@xxxxxxx>
Sent: 26 February 2022 4:12
To: Wei Chen <Wei.Chen@xxxxxxx>; Stefano Stabellini
<sstabellini@xxxxxxxxxx>
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Bertrand Marquis
<Bertrand.Marquis@xxxxxxx>; Penny Zheng <Penny.Zheng@xxxxxxx>; Henry Wang
<Henry.Wang@xxxxxxx>; nd <nd@xxxxxxx>
Subject: Re: Proposal for Porting Xen to Armv8-R64 - DraftA

Hi Wei,

On 25/02/2022 10:48, Wei Chen wrote:
      Armv8-R64 can support up to 256 MPU regions, but that is only the
      theoretical maximum. We don't want to define `pr_t mpu_regions[256]`,
      because that wastes memory most of the time. So we decided to let
      the user specify the number through a Kconfig option. The default
      value of `CONFIG_ARM_MPU_EL1_PROTECTION_REGIONS` can be `32`, which
      is typical of Armv8-R64 implementations. Users will recompile Xen
      when their platform changes, so when the MPU changes, respecifying
      the number of MPU protection regions will not cause additional
      problems.
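
To make the memory trade-off concrete, here is a minimal sketch of a
build-time-sized array. The layout of pr_t (one base and one limit
register value per region), the struct name and the helper are
assumptions for illustration, not the definitions from the proposal. The
#define stands in for the value Kconfig would generate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Kconfig-generated value (default of 32 as proposed). */
#ifndef CONFIG_ARM_MPU_EL1_PROTECTION_REGIONS
#define CONFIG_ARM_MPU_EL1_PROTECTION_REGIONS 32
#endif

/* Theoretical maximum quoted in the proposal. */
#define MPU_MAX_REGIONS 256

/* Assumed layout: one base and one limit register value per region. */
typedef struct {
    uint64_t prbar;
    uint64_t prlar;
} pr_t;

/* Hypothetical per-vCPU MPU state, sized when Xen is built. */
struct mpu_vcpu_state {
    pr_t mpu_regions[CONFIG_ARM_MPU_EL1_PROTECTION_REGIONS];
};

/* Reject a configuration that exceeds the theoretical maximum. */
static_assert(CONFIG_ARM_MPU_EL1_PROTECTION_REGIONS <= MPU_MAX_REGIONS,
              "too many MPU protection regions configured");

/* Bytes saved per vCPU compared with always reserving 256 regions. */
static inline size_t mpu_bytes_saved(void)
{
    return (size_t)(MPU_MAX_REGIONS - CONFIG_ARM_MPU_EL1_PROTECTION_REGIONS)
           * sizeof(pr_t);
}
```

Under these assumptions each region costs 16 bytes, so a 32-region build
stores 512 bytes per vCPU instead of 4096.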

I wonder if we could probe the number of MPU regions at runtime and
dynamically allocate the memory needed to store them in arch_vcpu.


We have considered using a pr_t mpu_regions[0] in arch_vcpu. But it
seems we would run into problems with statically allocated arch_vcpu
instances and with sizeof.
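
The sizeof concern can be illustrated with a small sketch. The struct
below is a hypothetical stand-in for arch_vcpu, the pr_t layout is
assumed, and calloc() stands in for whatever allocator Xen would use. It
uses the C99 spelling [] of the [0] idiom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: one base and one limit register value per region. */
typedef struct {
    uint64_t prbar;
    uint64_t prlar;
} pr_t;

/* Hypothetical stand-in for arch_vcpu with a trailing flexible array. */
struct vcpu_fam {
    unsigned int nr_mpu_regions;
    pr_t mpu_regions[];          /* not counted by sizeof(struct vcpu_fam) */
};

/* Every allocation has to add the array size by hand. */
static inline size_t vcpu_fam_size(unsigned int nr)
{
    return sizeof(struct vcpu_fam) + (size_t)nr * sizeof(pr_t);
}

/* Allocate a zeroed instance with room for nr regions; NULL on failure. */
static inline struct vcpu_fam *vcpu_fam_alloc(unsigned int nr)
{
    struct vcpu_fam *v = calloc(1, vcpu_fam_size(nr));

    if (v != NULL)
        v->nr_mpu_regions = nr;
    return v;
}
```

A statically declared struct vcpu_fam, or an array of them, reserves no
space for the regions at all, which is why static instances and any code
relying on sizeof become a problem with this layout.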

Does it need to be embedded in arch_vcpu? If not, then we could allocate
memory outside and add a pointer in arch_vcpu.
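
A rough sketch of that alternative, assuming the region count has already
been probed from the hardware (the probe itself is not shown). All names
here are hypothetical, the pr_t layout is assumed, and calloc()/free()
stand in for Xen's own allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Theoretical maximum quoted in the proposal. */
#define MPU_MAX_REGIONS 256u

/* Assumed layout: one base and one limit register value per region. */
typedef struct {
    uint64_t prbar;
    uint64_t prlar;
} pr_t;

/* Hypothetical slice of arch_vcpu: only a count and a pointer are embedded. */
struct vcpu_mpu {
    unsigned int nr_mpu_regions;
    pr_t *mpu_regions;           /* allocated outside the structure */
};

/* Allocate storage for nr_probed regions. Returns 0 on success, -1 on error. */
static inline int vcpu_mpu_init(struct vcpu_mpu *v, unsigned int nr_probed)
{
    assert(v != NULL);

    if (nr_probed == 0 || nr_probed > MPU_MAX_REGIONS)
        return -1;

    v->mpu_regions = calloc(nr_probed, sizeof(*v->mpu_regions));
    if (v->mpu_regions == NULL)
        return -1;

    v->nr_mpu_regions = nr_probed;
    return 0;
}

/* Release the region storage and reset the embedded fields. */
static inline void vcpu_mpu_free(struct vcpu_mpu *v)
{
    assert(v != NULL);

    free(v->mpu_regions);
    v->mpu_regions = NULL;
    v->nr_mpu_regions = 0;
}
```

With this shape the embedded part keeps a fixed size whatever the
hardware reports, at the cost of one extra pointer dereference when the
regions are saved or restored.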


We had thought of using a pointer in arch_vcpu instead of embedding
mpu_regions in arch_vcpu. But we noticed that arch_vcpu has a
__cacheline_aligned attribute, which may be because arch_vcpu is used
very frequently in some critical paths. So using a pointer for
mpu_regions may cause cache misses in those critical paths, for example
in context_switch.

From my understanding, the idea behind ``cacheline_aligned`` is to avoid struct vcpu sharing a cacheline with other data structures. Otherwise you may end up with two pCPUs frequently writing to the same cacheline, which is not ideal.
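
As a small illustration of that point, here is a sketch assuming a
64-byte cacheline and the GCC/Clang aligned attribute. The counter
structs and the helper are hypothetical and only model the layout; they
are not taken from Xen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size; real hardware may differ. */
#define CACHELINE_BYTES 64

/* Unpadded: several of these fit into one cacheline. */
struct plain_counter {
    uint64_t value;
};

/* Aligned: the size is rounded up so each instance owns a whole line. */
struct padded_counter {
    uint64_t value;
} __attribute__((aligned(CACHELINE_BYTES)));

/*
 * Do elements i and j of an array start in the same cacheline?
 * Assumes the array itself starts on a cacheline boundary.
 */
static inline int shares_line(size_t elem_size, size_t i, size_t j)
{
    return (i * elem_size) / CACHELINE_BYTES ==
           (j * elem_size) / CACHELINE_BYTES;
}
```

Two neighbouring plain_counter elements land in the same line, so writes
from two pCPUs would contend for it, while neighbouring padded_counter
elements never do.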

arch_vcpu should embed anything that will be accessed often (e.g. on entry/exit), up to a certain point. For instance, not everything related to the vGIC is embedded in the vCPU/Domain structure.

I am a bit split regarding the mpu_regions. If they are mainly used in the context_switch() then I would argue this is a premature optimization because the scheduling decision is probably going to take a lot more time than the context switch itself.

Note that for the P2M we already have that indirection because it is embedded in the struct domain.

This raises one question: why will the MPU regions be per-vCPU rather than per-domain?

Cheers,

--
Julien Grall



 

