[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V4 1/3] xen: Introduce "gpaddr_bits" field to XEN_SYSCTL_physinfo

On 05.10.21 00:11, Stefano Stabellini wrote:

Hi Stefano

On Sat, 2 Oct 2021, Oleksandr wrote:
On 02.10.21 10:35, Julien Grall wrote:

Thank you for your comments!


       On Sat, 2 Oct 2021, 01:24 Stefano Stabellini, <sstabellini@xxxxxxxxxx> 
       Bertrand, see comment on ID_AA64MMFR0_EL1 below, any ideas?

       On Fri, 1 Oct 2021, Oleksandr wrote:
       > On 01.10.21 10:50, Jan Beulich wrote:
       > > On 01.10.2021 01:00, Stefano Stabellini wrote:
       > > > On Thu, 30 Sep 2021, Oleksandr Tyshchenko wrote:
       > > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
       > > > >
       > > > > We need to pass info about maximum supported guest address
       > > > > space size to the toolstack on Arm in order to properly
       > > > > calculate the base and size of the extended region (safe range)
       > > > > for the guest. The extended region is unused address space which
       > > > > could be safely used by domain for foreign/grant mappings on Arm.
       > > > > The extended region itself will be handled by the subsequents
       > > > > patch.
       > > > >
       > > > > Use p2m_ipa_bits variable on Arm, the x86 equivalent is
       > > > > hap_paddr_bits.
       > > > >
       > > > > As we change the size of structure bump the interface version.
       > > > >
       > > > > Suggested-by: Julien Grall <jgrall@xxxxxxxxxx>
       > > > > Signed-off-by: Oleksandr Tyshchenko 
       > > > > Reviewed-by: Michal Orzel <michal.orzel@xxxxxxx>
       > > > Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
       > > I have to admit that I'm a little puzzled to see these R-b-s when ...
       > >
       > > > > Please note, that review comments for the RFC version [1] 
haven't been
       > > > > addressed yet.
       > > > > It is not forgotten, some clarification is needed. It will be 
       > > > > for the next version.
       > > > >
       > > > > [1]
       > > > > 
       > > ... Oleksandr makes clear this patch isn't really ready yet.
       > Unfortunately, this is true. I am still waiting for the clarification 

       Although I was aware of comments to older versions, this is actually the
       first version of this patch that I reviewed with any level of details; I
       didn't read previous comments very closely. I tried to find any bugs or
       problems with it and I couldn't see any, so I gave my reviewed-by. I
       should have clarified that was meant for the ARM part as I don't have a
       full understanding of the implications of using hap_paddr_bits on x86
       for VM migration.

       But let me take this opportunity to say that although I think the
       hypercall is OK, I wish we didn't need this patch at all: it is
       problematic because it touches tools, x86 and ARM hypervisor code all
       together. It needs at least three acks/reviewed-by to get accepted: from
       an x86 maintainer, an arm maintainer and from a tools maintainer. I
       don't say this to criticize the patch acceptance process: this patch
       makes changes to an existing hypercall so it is only fair that it needs
       to go through extra levels of scrutiny. For the sake of simplicity and
       decoupling (reducing dependencies between patches and between
       components), I think it would be best to introduce an #define for the
       minimum value of gpaddr_bits and then move this patch at the end of the
       series; that way it becomes optional.

It depends what you mean by optional. Yes we can add hack to avoid the 
hypercall... But the more scalable solution is the hypercall.

I am slightly concerned that if we don't push for the hypercall now, then there 
will be no incentive to do it afterwards...

So I went through Andrew's e-mail to understand what's the request. I 
understand that there are some problem with migration. But it
doesn't look like we need to solve them now. Instead,  AFAICT, his main ask for 
this series is to switch to a domctl.

It seems the conversation is simply stuck on waiting for Andrew to provide 
details on what would look like. Did we ping Andrew on

       Unfortunately the minimum value
       is 32 (in practice I have never seen less than 40 but the architecture
       supports 32 as minimum).

       Actually, the info we are looking for is already exposed via
       ID_AA64MMFR0_EL1. ID_AA64MMFR0_EL1 can be read from a virtual machine,
       and Linux let userspace read it [1]. Regardless of this patch series, we
       should make sure that Xen exposes the right mm64.pa_range value to guest
       virtual machines. If that is done right, then you can just add support
       for reading ID_AA64MMFR0_EL1 in libxl/libxc and then we don't need any
       hypercall modifications changes.

 From my understanding, from a VM PoV "pa_range" should represent the size of 
the guest physical address space.

Today, it happens that every VM is using the same P2M size. However, I would 
rather not make such assumption in the userspace.

       So, in theory we already have all the interfaces we need, but in
       practice they don't work: unfortunaly both Xen and Linux mark
       ID_AA64MMFR0_EL1 as FTR_HIDDEN in cpufeature.c so neither Linux from
       Xen, not userspace from Linux can actually read the real value :-/
       They always read zero.

       (Also I think we have an issue today with p2m_restrict_ipa_bits not
       updating the mm64.pa_range value. I think that it should be fixed.)

It looks like it. That should be handled in a separate patch though.

       Bertrand, do you have any ideas in regards to ID_AA64MMFR0_EL1?

       If not, maybe we could just go with
       #define MIN_GPADDR_BITS 32.

The toolstack would have to consider it as the "maximum" because it may not be 
safe to expose anything above.

With 32, we are going to be limited in term of space we can find.

We could potentially use 40 bits as a minimum. Although it still feels a bit of 
a hack to me given that the IOMMU may restrict it
further and the architecture can in theory support less.

Overall, I still strongly prefer the hypercall approach. If a common one is 
difficult to achieve, then we can extend the domctl to
create a domain to provide the p2m_bits (in the same way as we deal for the GIC 
version) in an arch specific way.

To summarize:
If we don't query the hypervisor to provide gpaddr_bits we have two options:
- The safe option is to use minimum possible value which is 32 bits on Arm64. 
But, there would be of no practical use.
- The unsafe option is to use let's say "default" 40 bits and pray it will work 
in all cases on Arm64 (it is ok on Arm32).

So we definitely need to query the hypervisor. As it turned out the sysctl 
approach is not welcome, in the long term we want to have this
information per domain. I have been absolutely OK with that valid ask since 
RFC, I just wanted to know what was the preferred way to do
this (new domctl, existing, etc)...

I analyzed what Julien had suggested regarding pass gpaddr_bits via Arm's 
struct xen_arch_domainconfig (I assume, this should be an OUT
parameter) and I think it makes sense. Taking into the account that the feature 
freeze date is coming, I will wait a few days, and if there
are no objections I will send updated version (patch #3 also needs updating as 
it expects the gpaddr_bits to be in physinfo).

No objections from me, I think Julien's suggestion is a good one.

ok, great. I have already implemented/tested that.

Thank you.


Oleksandr Tyshchenko



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.