[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V4 1/3] xen: Introduce "gpaddr_bits" field to XEN_SYSCTL_physinfo




On 02.10.21 10:35, Julien Grall wrote:

Hi Julien, Stefano.


Thank you for your comments!

Hi

On Sat, 2 Oct 2021, 01:24 Stefano Stabellini, <sstabellini@xxxxxxxxxx> wrote:
Bertrand, see comment on ID_AA64MMFR0_EL1 below, any ideas?


On Fri, 1 Oct 2021, Oleksandr wrote:
> On 01.10.21 10:50, Jan Beulich wrote:
> > On 01.10.2021 01:00, Stefano Stabellini wrote:
> > > On Thu, 30 Sep 2021, Oleksandr Tyshchenko wrote:
> > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> > > >
> > > > We need to pass info about maximum supported guest address
> > > > space size to the toolstack on Arm in order to properly
> > > > calculate the base and size of the extended region (safe range)
> > > > for the guest. The extended region is unused address space which
> > > > could be safely used by domain for foreign/grant mappings on Arm.
> > > > The extended region itself will be handled by the subsequents
> > > > patch.
> > > >
> > > > Use p2m_ipa_bits variable on Arm, the x86 equivalent is
> > > > hap_paddr_bits.
> > > >
> > > > As we change the size of structure bump the interface version.
> > > >
> > > > Suggested-by: Julien Grall <jgrall@xxxxxxxxxx>
> > > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> > > > Reviewed-by: Michal Orzel <michal.orzel@xxxxxxx>
> > > Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > I have to admit that I'm a little puzzled to see these R-b-s when ...
> >
> > > > Please note, that review comments for the RFC version [1] haven't been
> > > > addressed yet.
> > > > It is not forgotten, some clarification is needed. It will be addressed
> > > > for the next version.
> > > >
> > > > [1]
> > > > https://lore.kernel.org/xen-devel/973f5344-aa10-3ad6-ff02-ad5f358ad279@xxxxxxxxxx/
> > ... Oleksandr makes clear this patch isn't really ready yet.
>
> Unfortunately, this is true. I am still waiting for the clarification [1]

Although I was aware of comments to older versions, this is actually the
first version of this patch that I reviewed with any level of details; I
didn't read previous comments very closely. I tried to find any bugs or
problems with it and I couldn't see any, so I gave my reviewed-by. I
should have clarified that was meant for the ARM part as I don't have a
full understanding of the implications of using hap_paddr_bits on x86
for VM migration.


But let me take this opportunity to say that although I think the
hypercall is OK, I wish we didn't need this patch at all: it is
problematic because it touches tools, x86 and ARM hypervisor code all
together. It needs at least three acks/reviewed-by to get accepted: from
an x86 maintainer, an arm maintainer and from a tools maintainer. I
don't say this to criticize the patch acceptance process: this patch
makes changes to an existing hypercall so it is only fair that it needs
to go through extra levels of scrutiny. For the sake of simplicity and
decoupling (reducing dependencies between patches and between
components), I think it would be best to introduce an #define for the
minimum value of gpaddr_bits and then move this patch at the end of the
series; that way it becomes optional.

It depends what you mean by optional. Yes we can add hack to avoid the hypercall... But the more scalable solution is the hypercall.

I am slightly concerned that if we don't push for the hypercall now, then there will be no incentive to do it afterwards...

So I went through Andrew's e-mail to understand what's the request. I understand that there are some problem with migration. But it doesn't look like we need to solve them now. Instead,  AFAICT, his main ask for this series is to switch to a domctl.

It seems the conversation is simply stuck on waiting for Andrew to provide details on what would look like. Did we ping Andrew on IRC?

Unfortunately the minimum value
is 32 (in practice I have never seen less than 40 but the architecture
supports 32 as minimum).


Actually, the info we are looking for is already exposed via
ID_AA64MMFR0_EL1. ID_AA64MMFR0_EL1 can be read from a virtual machine,
and Linux let userspace read it [1]. Regardless of this patch series, we
should make sure that Xen exposes the right mm64.pa_range value to guest
virtual machines. If that is done right, then you can just add support
for reading ID_AA64MMFR0_EL1 in libxl/libxc and then we don't need any
hypercall modifications changes.

From my understanding, from a VM PoV "pa_range" should represent the size of the guest physical address space.

Today, it happens that every VM is using the same P2M size. However, I would rather not make such assumption in the userspace.


So, in theory we already have all the interfaces we need, but in
practice they don't work: unfortunaly both Xen and Linux mark
ID_AA64MMFR0_EL1 as FTR_HIDDEN in cpufeature.c so neither Linux from
Xen, not userspace from Linux can actually read the real value :-/
They always read zero.

(Also I think we have an issue today with p2m_restrict_ipa_bits not
updating the mm64.pa_range value. I think that it should be fixed.)

It looks like it. That should be handled in a separate patch though.


Bertrand, do you have any ideas in regards to ID_AA64MMFR0_EL1?

If not, maybe we could just go with
#define MIN_GPADDR_BITS 32.

The toolstack would have to consider it as the "maximum" because it may not be safe to expose anything above.

With 32, we are going to be limited in term of space we can find.

We could potentially use 40 bits as a minimum. Although it still feels a bit of a hack to me given that the IOMMU may restrict it further and the architecture can in theory support less.

Overall, I still strongly prefer the hypercall approach. If a common one is difficult to achieve, then we can extend the domctl to create a domain to provide the p2m_bits (in the same way as we deal for the GIC version) in an arch specific way.


To summarize:
If we don't query the hypervisor to provide gpaddr_bits we have two options:
- The safe option is to use minimum possible value which is 32 bits on Arm64. But, there would be of no practical use.
- The unsafe option is to use let's say "default" 40 bits and pray it will work in all cases on Arm64 (it is ok on Arm32).

So we definitely need to query the hypervisor. As it turned out the sysctl approach is not welcome, in the long term we want to have this information per domain. I have been absolutely OK with that valid ask since RFC, I just wanted to know what was the preferred way to do this (new domctl, existing, etc)...

I analyzed what Julien had suggested regarding pass gpaddr_bits via Arm's struct xen_arch_domainconfig (I assume, this should be an OUT parameter) and I think it makes sense. Taking into the account that the feature freeze date is coming, I will wait a few days, and if there are no objections I will send updated version (patch #3 also needs updating as it expects the gpaddr_bits to be in physinfo).


-- 
Regards,

Oleksandr Tyshchenko

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.