
Re: [RFC PATCH] xen/memory: Introduce a hypercall to provide unallocated space





On 10/08/2021 12:58, Oleksandr wrote:

On 10.08.21 09:34, Wei Chen wrote:
Hi Oleksandr,

Hi Wei, Julien.
Hi,

-----Original Message-----
From: Oleksandr <olekstysh@xxxxxxxxx>
Sent: 10 August 2021 2:25
To: Julien Grall <julien@xxxxxxx>
Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>; Andrew Cooper
<andrew.cooper3@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx; Oleksandr
Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>; Daniel De Graaf
<dgdegra@xxxxxxxxxxxxx>; Daniel P. Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>;
Ian Jackson <iwj@xxxxxxxxxxxxxx>; Wei Liu <wl@xxxxxxx>; George Dunlap
<george.dunlap@xxxxxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Volodymyr
Babchuk <Volodymyr_Babchuk@xxxxxxxx>; Roger Pau Monné
<roger.pau@xxxxxxxxxx>; Bertrand Marquis <Bertrand.Marquis@xxxxxxx>; Wei
Chen <Wei.Chen@xxxxxxx>
Subject: Re: [RFC PATCH] xen/memory: Introduce a hypercall to provide
unallocated space


On 09.08.21 18:42, Julien Grall wrote:
Hi Oleksandr,

Hi Julien.


Thank you for the input.


On 07/08/2021 18:03, Oleksandr wrote:
On 06.08.21 03:30, Stefano Stabellini wrote:

Hi Stefano

On Wed, 4 Aug 2021, Julien Grall wrote:
+#define GUEST_SAFE_RANGE_BASE xen_mk_ullong(0xDE00000000) /* 128GB */
+#define GUEST_SAFE_RANGE_SIZE xen_mk_ullong(0x0200000000)

While the possible new DT bindings have not been agreed yet, I re-used the
existing "reg" property under the hypervisor node to pass the safe range as a
second region, see
https://elixir.bootlin.com/linux/v5.14-rc4/source/Documentation/devicetree/bindings/arm/xen.txt#L10:
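For illustration, a guest could then pick the safe range up at boot. Below is a
minimal sketch (not actual kernel code, the helper name is made up), assuming
region 0 of the "xen,xen" node stays the grant table space and region 1 becomes
the safe range:

/*
 * Hypothetical sketch only: read the proposed second "reg" entry of the
 * Xen hypervisor node in a Linux guest.
 */
#include <linux/errno.h>
#include <linux/ioport.h>
#include <linux/of.h>
#include <linux/of_address.h>

static int __init xen_get_safe_range(phys_addr_t *base, phys_addr_t *size)
{
	struct device_node *np;
	struct resource res;
	int ret;

	np = of_find_compatible_node(NULL, NULL, "xen,xen");
	if (!np)
		return -ENODEV;

	/* Index 1 = the second "reg" entry, i.e. the proposed safe range. */
	ret = of_address_to_resource(np, 1, &res);
	of_node_put(np);
	if (ret)
		return ret;

	*base = res.start;
	*size = resource_size(&res);

	return 0;
}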
So a single region works for a guest today, but for dom0 we will need multiple
regions because it may be difficult to find enough contiguous space for a
single region.

That said, as dom0 is mapped 1:1 (including some guest mappings), there is also
the question of where to allocate the safe region. For the grant table, we so
far re-use the Xen address space because it is assumed it will always be bigger
than the grant table.

I am not sure yet where we could allocate the safe regions.
Stefano, do you
have any ideas?
The safest choice would be the address range corresponding to memory
(/memory) not already allocated to Dom0.

For instance from my last boot logs:
(XEN) Allocating 1:1 mappings totalling 1600MB for dom0:
(XEN) BANK[0] 0x00000010000000-0x00000070000000 (1536MB)
(XEN) BANK[1] 0x00000078000000-0x0000007c000000 (64MB)

All the other ranges could be given as unallocated space:

- 0x0 - 0x10000000
- 0x70000000 - 0x78000000
- 0x8_0000_0000 - 0x8_8000_0000
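
To make the arithmetic concrete, here is a small standalone sketch (not Xen
code) that derives such holes by punching the dom0 banks out of the host RAM
banks. The host RAM layout in main() is an assumption chosen to reproduce the
ranges listed above:

/*
 * Standalone illustration only: compute "unallocated RAM" holes as
 * host RAM banks minus dom0 banks.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct range { uint64_t start, end; };   /* [start, end) */

static void print_holes(const struct range *ram, size_t nr_ram,
                        const struct range *dom0, size_t nr_dom0)
{
    for (size_t i = 0; i < nr_ram; i++) {
        uint64_t cur = ram[i].start;

        /* Assumes dom0 banks are sorted and contained in the RAM banks. */
        for (size_t j = 0; j < nr_dom0; j++) {
            if (dom0[j].start >= ram[i].end || dom0[j].end <= cur)
                continue;
            if (dom0[j].start > cur)
                printf("hole: 0x%"PRIx64" - 0x%"PRIx64"\n",
                       cur, dom0[j].start);
            cur = dom0[j].end;
        }
        if (cur < ram[i].end)
            printf("hole: 0x%"PRIx64" - 0x%"PRIx64"\n", cur, ram[i].end);
    }
}

int main(void)
{
    /* Assumed host RAM banks (made up for the example). */
    const struct range ram[] = {
        { 0x00000000ULL,  0x7c000000ULL },
        { 0x800000000ULL, 0x880000000ULL },
    };
    /* Dom0 banks from the boot log above. */
    const struct range dom0[] = {
        { 0x10000000ULL, 0x70000000ULL },
        { 0x78000000ULL, 0x7c000000ULL },
    };

    print_holes(ram, 2, dom0, 2);

    return 0;
}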
Thank you for the ideas.

If I got the idea correctly, yes: as these ranges represent real RAM, no I/O
would conflict with them and, as a result, no overlaps would be expected.
But I wonder, would this work if we have the IOMMU enabled for Dom0 and need
to establish a 1:1 mapping for the DMA devices to work with grant mappings...
In arm_iommu_map_page() we call guest_physmap_add_entry() with gfn == mfn, so
the question is: could we end up with this new gfn replacing a valid mapping
(one with a gfn allocated from the safe region)?
Right, when we enable the IOMMU for dom0, Xen will add an extra
mapping with GFN == MFN for foreign and grant pages. This is because
Linux is not aware of whether a device is protected by an IOMMU.
Therefore it assumes it is not and will use the MFN to configure
the device for DMA transactions.
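
For context, the code in question looks roughly like the following. This is a
paraphrased sketch from memory, not the exact Xen source (in particular the
error handling is simplified):

/*
 * Paraphrased sketch of arm_iommu_map_page(): it installs an extra
 * GFN == MFN mapping in the P2M so grant/foreign pages can be used for
 * DMA by a direct mapped domain whose devices sit behind an IOMMU.
 */
int arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
                       unsigned int flags, unsigned int *flush_flags)
{
    p2m_type_t t;

    /* Only meaningful for direct mapped domains, called with dfn == mfn. */
    if ( !is_domain_direct_mapped(d) || dfn_x(dfn) != mfn_x(mfn) )
        return -EINVAL;

    if ( !(flags & (IOMMUF_readable | IOMMUF_writable)) )
        return -EINVAL;

    t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;

    /*
     * This is the point Oleksandr is asking about: guest_physmap_add_entry()
     * would replace any mapping already present at this GFN, e.g. one backed
     * by a page from the proposed safe region.
     */
    return guest_physmap_add_entry(d, _gfn(dfn_x(dfn)), mfn, 0, t);
}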

We can't remove the mapping without significant changes in Linux and
Xen. I would not mandate them for this work.

That said, I think it would be acceptable to have a different way to
find the region depending on the dom0 configuration. So we could use
the RAM not used by dom0 when the IOMMU is turned off.
OK


The second best choice would be a hole: an address range not used by
anybody else (no "reg" property) and also not even mappable by a bus
(not covered by a "ranges" property). This is not the best choice because
there can be cases where physical resources appear afterwards.
Are you saying that the original device-tree doesn't even describe
them in any way (i.e. reserved...)?

Unfortunately, yes.
So the decision of where the safe region is located will be made by Xen.
There is no involvement of the domain (it will discover the region
from the DT). Therefore, I don't think we need to think about
everything right now, as we could adapt later because the exact region is not
part of the stable ABI.

Hotplug is one case I would defer, because it is not supported (and
quite likely not working) in Xen upstream today.
Sounds reasonable.



Now regarding the case where dom0 is using the IOMMU. The assumption
is Xen will be able to figure out all the regions used from the
firmware table (ACPI or DT).

AFAIK, this assumption would be correct for DT. However, for ACPI, I
remember we were not able to find all the MMIO regions in Xen (see [1]
and [2]). So even this solution would not work for ACPI.

If I am not mistaken, we don't support IOMMU with ACPI yet. So we
could defer the problem to when this is going to be supported.
Sounds reasonable.


To summarize:

0. Skip ACPI case for now, implement for DT case

1. If IOMMU is enabled for Dom0 -> provide holes found in Host DT as
safe ranges

Can static allocation and direct mapped driver domains be treated
as this case?
I am not sure I can answer this question correctly due to my limited knowledge of these features.

But it feels to me that the holes solution would work; at least I don't see why not.


I also wonder why these can't be treated as case #2 (where we provide unassigned RAM); again I don't see why not, although there might be pitfalls with a direct mapped driver domain. Julien, do you have any opinion on this?

So whether the memory is statically allocated or dynamically allocated should not matter here. Direct mapped domains should be treated the same way as dom0.

By that I mean that if the IOMMU is not enabled for the domain, then we can use the unallocated RAM. Otherwise, we would need to find some holes.
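
In other words, something like the following (purely illustrative; the enum and
helper names are made up, this is not proposed code):

/* Purely illustrative: restating the rule above in code form. */
enum safe_range_source {
    SAFE_RANGE_FROM_UNALLOCATED_RAM,   /* RAM not given to the domain  */
    SAFE_RANGE_FROM_DT_HOLES,          /* holes found in the host DT   */
};

/*
 * Applies to dom0 and to direct mapped (including statically allocated)
 * driver domains alike.
 */
static enum safe_range_source pick_safe_range_source(bool iommu_enabled)
{
    return iommu_enabled ? SAFE_RANGE_FROM_DT_HOLES
                         : SAFE_RANGE_FROM_UNALLOCATED_RAM;
}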

Cheers,

--
Julien Grall



 

