
Re: [PATCH] xen/common: Do not allocate magic pages 1:1 for direct mapped domains


  • To: Julien Grall <julien@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Henry Wang <xin.wang2@xxxxxxx>
  • Date: Wed, 28 Feb 2024 20:50:08 +0800
  • Cc: Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, "Stefano Stabellini" <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, "Volodymyr Babchuk" <Volodymyr_Babchuk@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Alec Kwapis <alec.kwapis@xxxxxxxxxxxxx>
  • Delivery-date: Wed, 28 Feb 2024 12:50:28 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Julien,

On 2/28/2024 8:27 PM, Julien Grall wrote:
Hi Henry,
After checking the code flow, the rough plan below came to mind. I think what we need to do is:

(1) Find a range of unused memory using a method similar to find_unallocated_memory()/find_domU_holes()

AFAIK, the toolstack doesn't have any knowledge of the memory layout for dom0less domUs today. We would need to expose it first.

If I understand correctly, I think the issue you mentioned here and ...

Then the region could either be statically allocated (i.e. the admin provides it in the DTB) or dynamically.

(2) Change the base address, i.e. GUEST_MAGIC_BASE in alloc_xs_page() in init-dom0less.c, to point to the address found in (1) if the domain uses static mem or is 1:1 direct mapped. (I think this is a bit tricky though; do you have any method in mind?)

AFAIK, the toolstack doesn't know whether a domain is direct mapped or using static mem.

...here basically means we want to do the finding of the unused region in the toolstack. Since what we currently care about is only a couple of pages rather than the whole memory map, could we do the opposite: in alloc_xs_page(), we issue a domctl hypercall to Xen, do the finding work in Xen, and return the found gfn? Then the page can be mapped by populate_physmap() from alloc_xs_page() and used for XenStore.

I know that DOMCTL hypercalls are not stable. But I am not overly happy with creating a hypercall which is just "fetch the magic regions". I think it needs to be a more general one that would expose all the regions.

Also, you can't really find the magic regions when the hypercall is done as we don't have the guest memory layout. This will need to be done in advance.

Overall, I think it would be better if we provide a hypercall listing the regions currently occupied (similar to e820). One will have the type "magic pages".

Yeah, now it is clearer. I agree your approach is indeed a lot better. I will check how e820 works and see if I can do something similar. Also, it might not be related, but I think we had a similar discussion in [1] when I did the static heap work.
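
As a rough illustration of the e820-like listing suggested above, here is a minimal C sketch. The names region_type, struct region and find_region() are hypothetical and are not existing Xen or libxl interfaces. The sketch only shows the idea of a typed region list in which one entry type marks the magic pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region types, loosely modelled on e820 entry types. */
enum region_type {
    REGION_RAM,
    REGION_EXTENDED,
    REGION_MAGIC,    /* the "magic pages" type suggested in the thread */
};

/* One entry of the hypothetical per-domain region list. */
struct region {
    uint64_t base_gfn;   /* first guest frame number of the region */
    uint64_t nr_pages;   /* length of the region in pages */
    enum region_type type;
};

/* Return the first region of the requested type, or NULL if there is none. */
static const struct region *find_region(const struct region *map, size_t nr,
                                        enum region_type type)
{
    for (size_t i = 0; i < nr; i++) {
        if (map[i].type == type)
            return &map[i];
    }
    return NULL;
}
```

With a list like this, a caller such as alloc_xs_page() could look up the REGION_MAGIC entry instead of relying on a fixed GUEST_MAGIC_BASE.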

If above approach makes sense to you, I have a further question: Since I understand that the extended region is basically for safely foreign mapping pages

This is not about safety. The extended region is optional. It was introduced so it is easy for Linux to find an unallocated region to map external pages (e.g. vCPU shared info) so it doesn't waste RAM pages.

, and init-dom0less.c uses foreign memory map for this XenStore page, should we find the wanted page in the extended region? Or should the extended region even be excluded?

How is the extended region found for dom0less domUs today?

The extended regions for dom0less domUs are found by the function find_domU_holes(), introduced by commit 2a24477 ("xen/arm: implement domU extended regions"). I think this commit basically followed the original algorithm introduced by commit 57f8785 ("libxl/arm: Add handling of extended regions for DomU").

It would be fine to steal some part of the extended regions for the magic pages. But they would need to be reserved *before* the guest is first unpaused.

I also thought about this today. Writing a function that basically does the same as what we do for extended regions is probably too much, so stealing some part of the extended region is easier. What I worry about is this: once the extended regions are allocated and written to the "reg" property of the device tree hypervisor node, the data structures recording these extended regions are freed. So we kind of "lose" the information about these extended regions if we want to get them later (for example in my use case). Also, if we steal some part of memory from the extended regions, should we carve out the "stolen" parts from the device tree as well?
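
To make the carve-out question concrete, here is a small C sketch of taking pages from the tail of one extended region and shrinking the recorded region accordingly. struct ext_region and steal_tail() are hypothetical names used only for illustration. The sketch assumes the region bookkeeping is still available at the time of stealing, which is exactly the concern raised above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of one extended region, in units of pages. */
struct ext_region {
    uint64_t base_gfn;
    uint64_t nr_pages;
};

/*
 * Steal nr pages from the end of the region. On success the region is
 * shrunk so the stolen pages are no longer advertised, and the first
 * stolen gfn is stored in *stolen_gfn. Returns false if the request
 * is empty or does not fit.
 */
static bool steal_tail(struct ext_region *r, uint64_t nr, uint64_t *stolen_gfn)
{
    if (nr == 0 || nr > r->nr_pages)
        return false;

    r->nr_pages -= nr;
    *stolen_gfn = r->base_gfn + r->nr_pages;
    return true;
}
```

If the shrunk region is what ends up in the "reg" property, the guest never sees the stolen pages as part of an extended region.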

Also, why are you only checking the first GFN? What if the caller passes an overlapping region?

I am a bit confused. My understanding is at this point we are handling one page at a time.

We are handling one "extent" at a time. This could be one or multiple pages (see extent_order).

I agree, sorry I didn't express myself well. For this specific XenStore page, I think the extent_order is fixed as 0 so there is only 1 page.

Correct. But you should not rely on this :).

Yeah definitely :)
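
The point about not relying on extent_order being 0 can be sketched as a check over the whole extent rather than only its first GFN. extent_overlaps() is a hypothetical helper, not code from the patch. It assumes order is below 64 and that neither range wraps around.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the extent [gfn, gfn + 2^order) intersects the page
 * range [start, start + nr). Checking only gfn would miss an extent
 * that begins before the range and runs into it.
 */
static bool extent_overlaps(uint64_t gfn, unsigned int order,
                            uint64_t start, uint64_t nr)
{
    uint64_t extent_end = gfn + (UINT64_C(1) << order);
    uint64_t range_end = start + nr;

    return gfn < range_end && start < extent_end;
}
```

For example, an order-2 extent starting two pages below the range is reported as overlapping, even though its first GFN lies outside the range.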

[1] https://lore.kernel.org/xen-devel/e53601a1-a5ac-897a-334d-de45d96e9863@xxxxxxx/

Kind regards,
Henry




 

