[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH V4 1/3] xen: Introduce "gpaddr_bits" field to XEN_SYSCTL_physinfo
- To: Julien Grall <julien.grall.oss@xxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
- From: Oleksandr <olekstysh@xxxxxxxxx>
- Date: Sat, 2 Oct 2021 17:08:14 +0300
- Cc: Jan Beulich <jbeulich@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
- Delivery-date: Sat, 02 Oct 2021 14:08:34 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 02.10.21 10:35, Julien Grall wrote:
Hi Julien, Stefano.
Thank you for your comments!
Hi
Bertrand,
see comment on ID_AA64MMFR0_EL1 below, any ideas?
On Fri, 1 Oct 2021, Oleksandr wrote:
> On 01.10.21 10:50, Jan Beulich wrote:
> > On 01.10.2021 01:00, Stefano Stabellini wrote:
> > > On Thu, 30 Sep 2021, Oleksandr Tyshchenko
wrote:
> > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> > > >
> > > > We need to pass info about maximum
supported guest address
> > > > space size to the toolstack on Arm in
order to properly
> > > > calculate the base and size of the
extended region (safe range)
> > > > for the guest. The extended region is
unused address space which
> > > > could be safely used by domain for
foreign/grant mappings on Arm.
> > > > The extended region itself will be
handled by the subsequents
> > > > patch.
> > > >
> > > > Use p2m_ipa_bits variable on Arm, the
x86 equivalent is
> > > > hap_paddr_bits.
> > > >
> > > > As we change the size of structure
bump the interface version.
> > > >
> > > > Suggested-by: Julien Grall <jgrall@xxxxxxxxxx>
> > > > Signed-off-by: Oleksandr Tyshchenko
<oleksandr_tyshchenko@xxxxxxxx>
> > > > Reviewed-by: Michal Orzel <michal.orzel@xxxxxxx>
> > > Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > I have to admit that I'm a little puzzled to see
these R-b-s when ...
> >
> > > > Please note, that review comments for
the RFC version [1] haven't been
> > > > addressed yet.
> > > > It is not forgotten, some
clarification is needed. It will be addressed
> > > > for the next version.
> > > >
> > > > [1]
> > > > https://lore.kernel.org/xen-devel/973f5344-aa10-3ad6-ff02-ad5f358ad279@xxxxxxxxxx/
> > ... Oleksandr makes clear this patch isn't
really ready yet.
>
> Unfortunately, this is true. I am still waiting for
the clarification [1]
Although I was aware of comments to older versions, this
is actually the
first version of this patch that I reviewed with any level
of details; I
didn't read previous comments very closely. I tried to
find any bugs or
problems with it and I couldn't see any, so I gave my
reviewed-by. I
should have clarified that was meant for the ARM part as I
don't have a
full understanding of the implications of using
hap_paddr_bits on x86
for VM migration.
But let me take this opportunity to say that although I
think the
hypercall is OK, I wish we didn't need this patch at all:
it is
problematic because it touches tools, x86 and ARM
hypervisor code all
together. It needs at least three acks/reviewed-by to get
accepted: from
an x86 maintainer, an arm maintainer and from a tools
maintainer. I
don't say this to criticize the patch acceptance process:
this patch
makes changes to an existing hypercall so it is only fair
that it needs
to go through extra levels of scrutiny. For the sake of
simplicity and
decoupling (reducing dependencies between patches and
between
components), I think it would be best to introduce an
#define for the
minimum value of gpaddr_bits and then move this patch at
the end of the
series; that way it becomes optional.
It depends what you mean by optional. Yes we can
add hack to avoid the hypercall... But the more scalable
solution is the hypercall.
I am slightly concerned that if we don't push
for the hypercall now, then there will be no incentive to do
it afterwards...
So I went through Andrew's e-mail to understand
what's the request. I understand that there are some problem
with migration. But it doesn't look like we need to solve them
now. Instead, AFAICT, his main ask for this series is to
switch to a domctl.
It seems the conversation is simply stuck on
waiting for Andrew to provide details on what would look like.
Did we ping Andrew on IRC?
Unfortunately the minimum value
is 32 (in practice I have never seen less than 40 but the
architecture
supports 32 as minimum).
Actually, the info we are looking for is already exposed
via
ID_AA64MMFR0_EL1. ID_AA64MMFR0_EL1 can be read from a
virtual machine,
and Linux let userspace read it [1]. Regardless of this
patch series, we
should make sure that Xen exposes the right mm64.pa_range
value to guest
virtual machines. If that is done right, then you can just
add support
for reading ID_AA64MMFR0_EL1 in libxl/libxc and then we
don't need any
hypercall modifications changes.
From my understanding, from a VM PoV "pa_range"
should represent the size of the guest physical address space.
Today, it happens that every VM is using the
same P2M size. However, I would rather not make such
assumption in the userspace.
So, in theory we already have all the interfaces we need,
but in
practice they don't work: unfortunaly both Xen and Linux
mark
ID_AA64MMFR0_EL1 as FTR_HIDDEN in cpufeature.c so neither
Linux from
Xen, not userspace from Linux can actually read the real
value :-/
They always read zero.
(Also I think we have an issue today with
p2m_restrict_ipa_bits not
updating the mm64.pa_range value. I think that it should
be fixed.)
It looks like it. That should be handled in a
separate patch though.
Bertrand, do you have any ideas in regards to
ID_AA64MMFR0_EL1?
If not, maybe we could just go with
#define MIN_GPADDR_BITS 32.
The toolstack would have to consider it as the
"maximum" because it may not be safe to expose anything above.
With 32, we are going to be limited in term of
space we can find.
We could potentially use 40 bits as a minimum.
Although it still feels a bit of a hack to me given that the
IOMMU may restrict it further and the architecture can in
theory support less.
Overall, I still strongly prefer the hypercall
approach. If a common one is difficult to achieve, then we can
extend the domctl to create a domain to provide the p2m_bits
(in the same way as we deal for the GIC version) in an arch
specific way.
To summarize:
If we don't query the hypervisor to provide gpaddr_bits we have two
options:
- The safe option is to use minimum possible value which is 32 bits
on Arm64. But, there would be of no practical use.
- The unsafe option is to use let's say "default" 40 bits and pray
it will work in all cases on Arm64 (it is ok on Arm32).
So we definitely need to query the hypervisor. As it turned out the
sysctl approach is not welcome, in the long term we want to have
this information per domain. I have been absolutely OK with that
valid ask since RFC, I just wanted to know what was the preferred
way to do this (new domctl, existing, etc)...
I analyzed what Julien had suggested regarding pass gpaddr_bits via
Arm's struct xen_arch_domainconfig (I assume, this should be an OUT
parameter) and I think it makes sense. Taking into the account that
the feature freeze date is coming, I will wait a few days, and if
there are no objections I will send updated version (patch #3 also
needs updating as it expects the gpaddr_bits to be in physinfo).
--
Regards,
Oleksandr Tyshchenko
|