[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V5 2/3] libxl/arm: Add handling of extended regions for DomU



On Thu, 7 Oct 2021, Oleksandr wrote:
> On 07.10.21 04:29, Stefano Stabellini wrote:
> 
> Hi Stefano
> 
> > On Wed, 6 Oct 2021, Oleksandr Tyshchenko wrote:
> > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> > > 
> > > The extended region (safe range) is a region of guest physical
> > > address space which is unused and could be safely used to create
> > > grant/foreign mappings instead of wasting real RAM pages from
> > > the domain memory for establishing these mappings.
> > > 
> > > The extended regions are chosen at the domain creation time and
> > > advertised to it via "reg" property under hypervisor node in
> > > the guest device-tree. As region 0 is reserved for grant table
> > > space (always present), the indexes for extended regions are 1...N.
> > > If extended regions could not be allocated for some reason,
> > > Xen doesn't fail and behaves as usual, so only inserts region 0.
> > > 
> > > Please note the following limitations:
> > > - The extended region feature is only supported for 64-bit domain
> > >    currently.
> > > - The ACPI case is not covered.
> > > 
> > > ***
> > > 
> > > The algorithm to choose extended regions for non-direct mapped
> > > DomU is simpler in comparison with the algorithm for direct mapped
> > > Dom0. We usually have a lot of unused space above 4GB, and might
> > > have some unused space below 4GB (depends on guest memory size).
> > > Try to allocate separate 2MB-aligned extended regions from the first
> > > (below 4GB) and second (above 4GB) RAM banks taking into the account
> > > the maximum supported guest physical address space size and the amount
> > > of memory assigned to the guest. The minimum size of extended region
> > > the same as for Dom0 (64MB).
> > > 
> > > Suggested-by: Julien Grall <jgrall@xxxxxxxxxx>
> > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> > > ---
> > > ! Stefano, Ian, Michal I dropped your A-b/R-b. I decided to change patch
> > > to make it more functional !
> > > 
> > > Changes RFC -> V2:
> > >     - update patch description
> > >     - drop uneeded "extended-region" DT property
> > >     - clear reg array in finalise_ext_region() and add a TODO
> > > 
> > > Changes V2 -> V3:
> > >     - update patch description, comments in code
> > >     - only pick up regions with size >= 64MB
> > >     - move the region calculation to make_hypervisor_node() and drop
> > >       finalise_ext_region()
> > >     - extend the list of arguments for make_hypervisor_node()
> > >     - do not show warning for 32-bit domain
> > >     - change the region alignment from 1GB to 2MB
> > >     - move EXT_REGION_SIZE to public/arch-arm.h
> > > 
> > > Changes V3 -> V4:
> > >     - add R-b, A-b and T-b
> > > 
> > > Changes V4 -> V5:
> > >     - update patch description and comments in code
> > >     - reflect changes done in previous patch to pass gpaddr_bits
> > >       via createdomain domctl (struct xen_arch_domainconfig)
> > >     - drop R-b, A-b and T-b
> > >     - drop limit for maximum extended region size (128GB)
> > >     - try to also allocate region below 4GB, optimize code
> > >       for calculating extended regions
> > > ---
> > >   tools/libs/light/libxl_arm.c  | 80
> > > ++++++++++++++++++++++++++++++++++++++++---
> > >   xen/include/public/arch-arm.h |  2 ++
> > >   2 files changed, 77 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> > > index 45e0386..cd743f7 100644
> > > --- a/tools/libs/light/libxl_arm.c
> > > +++ b/tools/libs/light/libxl_arm.c
> > > @@ -600,9 +600,21 @@ static int make_timer_node(libxl__gc *gc, void *fdt,
> > >       return 0;
> > >   }
> > >   +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> > > +
> > >   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
> > > -                                const libxl_version_info *vers)
> > > +                                const libxl_version_info *vers,
> > > +                                const libxl_domain_build_info *b_info,
> > > +                                const struct xc_dom_image *dom)
> > >   {
> > > +    uint64_t region_size[GUEST_RAM_BANKS] = {0},
> > > region_base[GUEST_RAM_BANKS],
> > > +        banksize[GUEST_RAM_BANKS], bankend[GUEST_RAM_BANKS], ramsize;
> > > +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> > > +                  (GUEST_RAM_BANKS + 1)];
> > > +    be32 *cells = &regs[0];
> > > +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> > > +    unsigned int i, len, nr_regions = 0;
> > > +    uint8_t gpaddr_bits;
> > >       int res;
> > >       gic_interrupt intr;
> > >   @@ -617,9 +629,67 @@ static int make_hypervisor_node(libxl__gc *gc, void
> > > *fdt,
> > >                                 "xen,xen");
> > >       if (res) return res;
> > >   -    /* reg 0 is grant table space */
> > > -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
> > > GUEST_ROOT_SIZE_CELLS,
> > > -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> > > +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
> > > +        LOG(WARN, "The extended regions are only supported for 64-bit
> > > guest currently");
> > > +        goto out;
> > > +    }
> > > +
> > > +    gpaddr_bits = b_info->arch_arm.gpaddr_bits;
> > > +    assert(gpaddr_bits >= 32 && gpaddr_bits <= 48);
> > > +
> > > +    /*
> > > +     * Try to allocate separate 2MB-aligned extended regions from the
> > > first
> > > +     * (below 4GB) and second (above 4GB) RAM banks taking into the
> > > account
> > > +     * the maximum supported guest physical address space size and the
> > > amount
> > > +     * of memory assigned to the guest.
> > > +     * As the guest memory layout is not populated yet we cannot rely on
> > > +     * dom->rambank_size[], so calculate the actual size of both banks
> > > using
> > > +     * "max_memkb" value.
> > > +     */
> > > +    ramsize = b_info->max_memkb * 1024;
> > > +    if (ramsize <= GUEST_RAM0_SIZE) {
> > > +        banksize[0] = ramsize;
> > > +        banksize[1] = 0;
> > > +    } else {
> > > +        banksize[0] = GUEST_RAM0_SIZE;
> > > +        banksize[1] = ramsize - GUEST_RAM0_SIZE;
> > > +    }
> > > +
> > > +    bankend[0] = GUEST_RAM0_BASE + GUEST_RAM0_SIZE;
> > > +    bankend[1] = min(1ULL << gpaddr_bits, GUEST_RAM1_BASE +
> > > GUEST_RAM1_SIZE);
> > > +
> > > +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> > > +        region_base[i] = bankbase[i] + ALIGN_UP_TO_2MB(banksize[i]);
> > > +        if (bankend[i] > region_base[i])
> > > +            region_size[i] = bankend[i] - region_base[i];
> > > +    }
> > This seems correct but it looks a bit overkill. I would have written
> > like this:
> > 
> >      if (ramsize <= GUEST_RAM0_SIZE) {
> >          region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
> >          region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
> >          region_base[1] = GUEST_RAM1_BASE;
> >          region_size[1] = min(1ULL << gpaddr_bits, GUEST_RAM1_BASE +
> > GUEST_RAM1_SIZE) -
> >                           region_base[0];
> Why  "- region_base[0]" in last expression? I think it should be "-
> region_base[1]", the same as for "else" case (so it can be moved out of
> if-else construct). Also we need to check
> 
> that min(1ULL << gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE) is greater
> than region_base[1] before the subtraction. If gpaddr_bits = 32 (on Arm64) we
> will get incorrect result.
> 
> 
> >      } else {
> >          region_size[0] = 0;
> >          region_base[1] = GUEST_RAM1_BASE + ALIGN_UP_TO_2MB(ramsize -
> > GUEST_RAM0_SIZE);
> >          region_size[1] = min(1ULL << gpaddr_bits, GUEST_RAM1_BASE +
> > GUEST_RAM1_SIZE) -
> >                           region_base[1];
> >      }
> > 
> > Which removes the needs for banksize, bankend, bankbase. What do you
> > think? Your version works too, so I am OK anyway.
> Thank you for looking into this.
> 
> I think, you version will also work with adjustments. I am OK either way. Your
> version reduces the number of locals, so probably better.  Although "min(1ULL
> << gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE)" construction probably
> wants latching in local bank1end.
> 
> 
> Below the updated version:
> 
>     if (ramsize <= GUEST_RAM0_SIZE) {
>         region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
>         region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
>         region_base[1] = GUEST_RAM1_BASE;
>     } else
>         region_base[1] = GUEST_RAM1_BASE + ALIGN_UP_TO_2MB(ramsize - 
> GUEST_RAM0_SIZE);
> 
>     bank1end = min(1ULL << gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE);
>     if (bank1end > region_base[1])
>         region_size[1] = bank1end - region_base[1];


Yeah I like this. I'd be happy to go with it.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.