
Re: [Xen-devel] [Qemu-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap



On Mon, 1 Dec 2014, Don Slutz wrote:
> On 12/01/14 10:37, Stefano Stabellini wrote:
> > On Mon, 1 Dec 2014, Don Slutz wrote:
> > > On 11/27/14 05:48, Stefano Stabellini wrote:
> 
> [...]
> 
> > > 
> > > Works fine in both claim modes and with PoD used (maxmem > memory).  Do
> > > not know how to test with tmem.  I do not see how it would be worse than
> > > the current code that does not auto increase.  I.e. even without a xen
> > > change, I think something like this could be done.
> > OK, good to know. I am OK with increasing maxmem only if it is strictly
> > necessary.
> > 
> > 
> > > > > My testing shows a free 32 pages that I am not sure where they come
> > > > > from.  But the code above is passing my 8 nics of e1000.
> > > > I think that raising maxmem a bit higher than necessary is not too bad.
> > > > If we really care about it, we could lower the limit after QEMU's
> > > > initialization is completed.
> > > Ok.  I did find the 32: it is VGA_HOLE_SIZE.  So here is what I have,
> > > which includes a lot of extra printf.
> > In QEMU I would prefer not to assume that libxl increased maxmem for the
> > vga hole. I would rather call xc_domain_setmaxmem twice for the vga hole
> > than tie QEMU to a particular maxmem allocation scheme in libxl.
> 
> Ok.  The area we are talking about is 0x000a0000 to 0x000c0000.
> It is in libxc (xc_hvm_build_x86), not libxl.  I have no issue with a name
> change to something like QEMU_EXTRA_FREE_PAGES.

Sorry, I was thinking about the relocated videoram region, the one
allocated by xen_add_to_physmap in QEMU.

Regarding 0x000a0000-0x000c0000, according to this comment:

    /*
     * Allocate memory for HVM guest, skipping VGA hole 0xA0000-0xC0000.
     *

xc_hvm_build_x86 shouldn't be allocating memory for it.
In fact, if I remember correctly, 0xa0000-0xc0000 is trapped and emulated.


> My testing has shown that some of these 32 pages are used outside of QEMU.
> I am seeing just 23 free pages using a standalone program to display
> the same info after a CentOS 6.3 guest is done booting.
>
> > In libxl I would like to avoid increasing maxmem for anything QEMU will
> > allocate later, that includes rom and vga vram. I am not sure how to
> > make that work with older QEMU versions that don't call
> > xc_domain_setmaxmem by themselves yet though. Maybe we could check the
> > specific QEMU version in libxl and decide based on that. Or we could
> > export a feature flag in QEMU.
> 
> Yes, it would be nice to adjust libxl to not increase maxmem.  However,
> since videoram is included in memory (and maxmem), making the change
> related to vram is a bigger issue.
> 
> The rom change is much simpler.
> 
> Currently I do not know of a way to do different things based on the QEMU
> version and/or features (this includes getting the QEMU version in libxl).
> 
> I have been going with:
> 1) Change QEMU first.
> 2) Wait for an upstream version of QEMU with this.
> 3) Change xen to optionally use a feature in the latest QEMU.

Let's try to take this approach for the videoram too.


> > 
> > > --- a/xen-hvm.c
> > > +++ b/xen-hvm.c
> > > @@ -67,6 +67,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu)
> > >   #endif
> > > 
> > >   #define BUFFER_IO_MAX_DELAY  100
> > > +#define VGA_HOLE_SIZE (0x20)
> > > 
> > >   typedef struct XenPhysmap {
> > >       hwaddr start_addr;
> > > @@ -219,6 +220,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
> > >       xen_pfn_t *pfn_list;
> > >       int i;
> > >       xc_dominfo_t info;
> > > +    unsigned long max_pages, free_pages, real_free;
> > > +    long need_pages;
> > > +    uint64_t tot_pages, pod_cache_pages, pod_entries;
> > > +
> > > +    trace_xen_ram_alloc(ram_addr, size, mr->name);
> > > 
> > >       if (runstate_check(RUN_STATE_INMIGRATE)) {
> > >           /* RAM already populated in Xen */
> > > @@ -232,13 +238,6 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
> > >           return;
> > >       }
> > > 
> > > -    fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
> > > -            " bytes (%ld Kib) of ram at "RAM_ADDR_FMT
> > > -            " mr.name=%s\n",
> > > -            __func__, size, (long)(size>>10), ram_addr, mr->name);
> > > -
> > > -    trace_xen_ram_alloc(ram_addr, size);
> > > -
> > >       nr_pfn = size >> TARGET_PAGE_BITS;
> > >       pfn_list = g_malloc(sizeof (*pfn_list) * nr_pfn);
> > > 
> > > @@ -246,11 +245,38 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
> > >           pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
> > >       }
> > > 
> > > -    if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {
> > > +    if ((xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) ||

I think it's best to call xc_domain_getinfolist.
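Something along these lines is roughly what I have in mind -- only an
untested sketch, with the libxc names written from memory, so please
double-check the field names against xenctrl.h:

    xc_domaininfo_t info;   /* replaces the xc_dominfo_t declaration */

    /* xc_domain_getinfolist() returns the number of entries it filled in
     * and may report an adjacent domain if xen_domid has gone away, so
     * check both the return value and the domid it hands back. */
    if (xc_domain_getinfolist(xen_xc, xen_domid, 1, &info) != 1 ||
        info.domain != xen_domid) {
        hw_error("xc_domain_getinfolist failed");
    }

    /* Unlike xc_dominfo_t, these fields are already in pages. */
    max_pages  = info.max_pages;
    free_pages = max_pages - info.tot_pages;

That also gets rid of the max_memkb conversion below.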


> > > +       (info.domid != xen_domid)) {
> > >           hw_error("xc_domain_getinfo failed");
> > >       }
> > > -    if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > -                            (nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> > > +    max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
> > > +    free_pages = max_pages - info.nr_pages;
> > > +    real_free = free_pages;
> > > +    if (free_pages > VGA_HOLE_SIZE) {
> > > +        free_pages -= VGA_HOLE_SIZE;
> > > +    } else {
> > > +        free_pages = 0;
> > > +    }

I don't think we need to subtract VGA_HOLE_SIZE.
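With that gone, and using the getinfolist fields from above, the hunk could
shrink to something like this -- again just a rough, untested sketch, reusing
your variable names (need_pages has to stay signed):

    max_pages  = info.max_pages;              /* maxmem, in pages */
    free_pages = max_pages - info.tot_pages;  /* slack the toolstack left us */
    need_pages = nr_pfn - free_pages;         /* > 0 means maxmem must grow */

    if (need_pages > 0 &&
        xc_domain_setmaxmem(xen_xc, xen_domid,
                            (max_pages + need_pages) * XC_PAGE_SIZE / 1024) < 0) {
        hw_error("xc_domain_setmaxmem failed");
    }
    if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0,
                                         pfn_list)) {
        hw_error("xen: failed to populate ram at " RAM_ADDR_FMT, ram_addr);
    }

The extra debug fprintf can obviously be dropped before the patch goes to the
list.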


> > > +    need_pages = nr_pfn - free_pages;
> > > +    fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
> > > +            " bytes (%ld KiB) of ram at "RAM_ADDR_FMT
> > > +            " mp=%ld fp=%ld nr=%ld need=%ld mr.name=%s\n",
> > > +            __func__, size, (long)(size>>10), ram_addr,
> > > +           max_pages, free_pages, nr_pfn, need_pages,
> > > +            mr->name);
> > > +    if (xc_domain_get_pod_target(xen_xc, xen_domid, &tot_pages,
> > > +                                 &pod_cache_pages, &pod_entries) >= 0) {
> > > +        unsigned long populated = tot_pages - pod_cache_pages;
> > > +        long delta_tot = tot_pages - info.nr_pages;
> > > +
> > > +        fprintf(stderr, "%s: PoD pop=%ld tot=%ld(%ld) cnt=%ld ent=%ld nop=%ld free=%ld\n",
> > > +                __func__, populated, (long)tot_pages, delta_tot,
> > > +                (long)pod_cache_pages, (long)pod_entries,
> > > +               (long)info.nr_outstanding_pages, real_free);
> > > +    }
> > What is the purpose of calling xc_domain_get_pod_target here? It doesn't
> > look like it is affecting the parameters passed to xc_domain_setmaxmem.
> 
> This was just to test and see if anything was needed for PoD mode.
> I did not see anything.
> 
> Did you want me to make a patch to send to the list that has my proposed
> changes?

Yep, that's fine.


> I do think that what I have would be fine to do since most of the time the
> new code does nothing with the current xen (and older versions), until
> more than 4 nics are configured for xen.
> 
> It would be good to have the change:
> 
> [PATCH] libxl_set_memory_target: retain the same maxmem offset on top of the
> current target
> 
> (once it is working) in xen before using a QEMU with this change.
>
> Not sure I would push for 2.2 however.

I wouldn't.


>    -Don Slutz
> 
> > 
> > > +    if ((free_pages < nr_pfn) &&
> > > +       (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > +                            (need_pages * XC_PAGE_SIZE / 1024)) < 0)) {
> > >           hw_error("xc_domain_setmaxmem failed");
> > >       }
> > >       if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0, pfn_list)) {
> > > 
> > > 
> > > Note: I already had a trace_xen_ram_alloc; here is the delta in
> > > trace-events (which has others not yet sent):
> > > 
> > > 
> > > 
> > > diff --git a/trace-events b/trace-events
> > > index eaeecd5..a58fc11 100644
> > > --- a/trace-events
> > > +++ b/trace-events
> > > @@ -908,7 +908,7 @@ pvscsi_tx_rings_ppn(const char* label, uint64_t ppn) "%s page: %"PRIx64""
> > >   pvscsi_tx_rings_num_pages(const char* label, uint32_t num) "Number of %s pages: %u"
> > > 
> > >   # xen-all.c
> > > -xen_ram_alloc(unsigned long ram_addr, unsigned long size) "requested: %#lx, size %#lx"
> > > +xen_ram_alloc(unsigned long ram_addr, unsigned long size, const char* mr_name) "requested: %#lx size %#lx mr->name=%s"
> > >   xen_client_set_memory(uint64_t start_addr, unsigned long size, bool log_dirty) "%#"PRIx64" size %#lx, log_dirty %i"
> > >   handle_ioreq(void *req, uint32_t type, uint32_t dir, uint32_t df, uint32_t data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) "I/O=%p type=%d dir=%d df=%d ptr=%d port=%#"PRIx64" data=%#"PRIx64" count=%d size=%d"
> > >   handle_ioreq_read(void *req, uint32_t type, uint32_t df, uint32_t data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) "I/O=%p read type=%d df=%d ptr=%d port=%#"PRIx64" data=%#"PRIx64" count=%d size=%d"
> > > 
> > > 
> > >     -Don Slutz
> > > 
> > > > >      -Don Slutz
> > > > > 
> > > > > 
> > > > > > > > +        hw_error("xc_domain_setmaxmem failed");
> > > > > > > > +    }
> > > > > > > >         if (xc_domain_populate_physmap_exact(xen_xc, xen_domid,
> > > > > > > > nr_pfn, 0,
> > > > > > > > 0, pfn_list)) {
> > > > > > > >             hw_error("xen: failed to populate ram at "
> > > > > > > > RAM_ADDR_FMT,
> > > > > > > > ram_addr);
> > > > > > > >         }
> > > > > > > > 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
