[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Sat, 30 Oct 2021 09:53:19 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MVJ1k9E/HU64AYnlrTPywAaZ3jOuUc+JxseXIJ8jVeM=; b=N3ke0DeNSvH/HvWA2J7/08deNhg/TDOOur/ANgjn3cbpJxvIdltAST5tS7EcQNKC1SOVuvchkyMavziFmy2P+UjSnNIhYSwU5OWLVva0zKbuh/3WmL2CgGHUj+Ae4BXRhZhCnLmkv58K7aTvb15FOVMxJ7mv3i29Ze/2p5IJiOmLd0VkB8ZsJiRNqqiELdReaqqS4Jqzg4yWJqMllnC5MKKuUHPuX/qY5IMj3krwVDB1EichqSjXVckvIMSiBx1eftedPLN574Hogbu2dTAoW/uhqU2Up01yQQr+hG1wre2nm6MCE9gNjNSb9zkmStHSBzJINe1AAwur1vKOL7dDjw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=fCyKVZvnlpeVtjNCUNoESM1s4trefVVu/tUPn/6sIi95hGxVHKLBJVGf0s2Ys8oZNVBY/ZnQoDRQ7kfY3CO59JIuaqQztldU6XUjTSZg+RP0+aNAPuIUgAiyDIxxu+0Fx+3HzDsvfeHs+oTqdbRIvJ4H4kVlMTNtIoAr+ijbsrhbT6GICa1RqIBH07o0k4szg1klEGfRj/uK5yazHPzzwbDIQBH1HfjUghDboJ8g9nQ8TON+lzOl4PsEjEIMMsa5w6KrWk53QHN2mK3tSj2pG/CovmFeXf6mj5JBap6ukX7UatX36hso/8NtRUrlwdXoNfUiPJNKDGWmW8o7mnNtkw==
  • Authentication-results: esa6.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Christian Lindig <christian.lindig@xxxxxxxxxx>, David Scott <dave@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Sat, 30 Oct 2021 07:54:11 +0000
  • Ironport-data: A9a23:8s4hcqChlDCBDBVW/3jlw5YqxClBgxIJ4kV8jS/XYbTApGsqhTZVm 2UZWWnXbqnbNGKhKtBxb4jioR5X6Mfcx4VhQQY4rX1jcSlH+JHPbTi7wuYcHM8wwunrFh8PA xA2M4GYRCwMo/u1Si6FatANl1ElvU2zbue6WLGs1hxZH1c+EX5500M7wIbVv6Yz6TSHK1LV0 T/Ni5W31G+Ng1aY5UpNtspvADs21BjDkGtwUm4WPJinj3eH/5UhN7oNJLnZEpfNatI88thW5 Qr05OrREmvxp3/BAz4++1rxWhVirrX6ZWBihpfKMkQLb9crSiEai84G2PQghUh/sSW5lPEsz I9xqt+Rax8wIPzFtLgFTEwNe81+FfUuFL7vJHG+tYqYzlHccmuqyPJrZK00FdRGoKAtWzgIr KFGbmBWBvyAr7veLLaTUO5ji95lNMD2FIgepmth3XfSCvNOrZXrHvuUuoAEhGZYasZmTa7RS NggZzNTNBXFWy9FIH4HMMgytbL97pX4W2IB8w/EzUYt2EDfxRJ8+KLgO93UfpqNX8o9tkSXv GXd5EziHwoXcteYzFKt822urv/CmzvhX4AfH6H+8eRl6HWtwWgUBAwTREGMi/CzgU6jWPpSM 0URvCEpqMAa71e3R9PwWxm5pn+svRMGXddUVeog52ml1a788wufQG8eQVZpeNEg8cM7WzEu/ luIhM/yQyxitqWPTnCQ/avSqim9UQAXMGsDaCksXQYDpd75r+kbjBjCU9JiG66dlcDuFHf7x DXikcQlr+xN14hRjfz9pA2ZxWL3znTUcuIrzj/HbmmEswcjWJHmRq6w8EaK5NocNJnMGzFto 0M4s8SZ6ekPC7SEmyqMXPgBEdmV2hqVDNHPqQUxRsd8rlxB71bmJNkNu28meC+FJ+5dIWexC HI/rz+983O60JGCSaZsK7y8BM0xpUQLPYS0D6uEBjaij3UYSeNmwM2MTRLPt4wOuBJ1+U3aB Xt9WZz0ZZr9If8+pAdav89HjdcWKtkWnAs/v6zTwRW9yqa5b3WIU7oDO1bmRrlnt/7Z+1+Er I4FbpDiJ/BjvAvWOHG/HWk7dgliEJTGLcqu95w/mhCre1IO9J4d5w/5nup6Jt0Nc1V9nebU5 HCtMnK0O3Kk7UAr3T6iMyg5AJu2BM4XhStiYUQEYAb5s1B+MN3HxPpOKPMKkUwPqbULIQhcF KJeJa1tw51nF1z6xtjqRcCm8dE5K0jz3Wpj/UONOVACQnKpfCSQkvfMdQrz7igeSC2xsMo1u bq70Q3HB5EEQmxf4Az+MZpDFnu94ioQnvxcRUzNLoUBcUng6tEyeSfwkuU2M4cHLhCanmmW0 AOfABE5o+jRotBqrImV1P7c94r5QfFjGkd6HnXA6erkPyft4Wf+k5RLV/yFfG6BWTqsqrmif +hc09r1LOYDwARRq4N5HrsylfA+6tLjqqV01ANhGHmXPV2nBqk5eiuN3NVVt70Lzbhc4FPkV kWK89hcGLOIJMK6TwJBeFt7NryOjKhGlCPT4PI5JFTByBV2pLfXA19POxSsiTBGKOcnOo0S3 up86tUd7Bayi0R2P4/e3DxU7WmFMlcJT74j6sMBGIbuhwcmlgNCbJjbBnOk6Z2DcYwRYEwjI zvSj6venbVMgEHFdiNrR3TK2ONcg7UIuQxLkwBedwjYxIKdi69lxgBV/BQ2Uh9Rn0dO3O9EM 2R2M1F4ePeV9DByickfB22hFmmt3vFCFpAdH7fRqFDkcg==
  • Ironport-hdrordr: A9a23:lea7kam6sl/ilplQ5RrJ4F/J6kHpDfIS3DAbv31ZSRFFG/Fw8P re/sjztCWE6wr5PUtKpTnuAsi9qB/nmqKdgrN8AV7BZmTbUQKTRelfBOPZsljd8kbFltK1u5 0PT0DZYueAaWRSvILf2k2ZCNY7hP2K7aiEjfrXpk0GcT1X
  • Ironport-sdr: HzU8GZT1m77svVQDHoja5CToL11dWaDmNAV9SDJyMRm2ZxMUWSdH32rdwgYtkKWZC3nimkhluA dgJfFEHbcvjMuYKzdx0CFC8KjTmPKdu0AkdtPF2iKUhzEhgoAxXMJst/Nmg+c5IHw/Jh8hb2Gf a2L5kXpH/lBeJCHOxABYGj9zk8fh7rN+GTeMMzcC/Zs7yjcYxiJXoGfIZQymGnxy0+XrCVFhhz FXtHJ2c/Tz7oWcYadkUu/lRqe2Dm11JoEvXIAxvFdtI7Ae3CWnQmKb4VF6/uHGbmAPltvIShbk E2oRZugYecRTd0taa1zdq66l
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Oct 29, 2021 at 05:39:52PM +0100, Andrew Cooper wrote:
> On 29/10/2021 08:59, Roger Pau Monne wrote:
> > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> > index e510395d08..f94f0f272c 100644
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -1917,11 +1918,33 @@ active_alloc_failed:
> >  }
> >  
> >  int grant_table_init(struct domain *d, int max_grant_frames,
> > -                     int max_maptrack_frames)
> > +                     int max_maptrack_frames, unsigned int options)
> >  {
> >      struct grant_table *gt;
> > +    unsigned int max_grant_version = options & 
> > XEN_DOMCTL_GRANT_version_mask;
> >      int ret = -ENOMEM;
> >  
> > +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> > +        max_grant_version = opt_gnttab_max_version;
> > +    if ( !max_grant_version )
> > +    {
> > +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 
> > requested\n",
> > +                d);
> > +        return -EINVAL;
> > +    }
> > +    if ( max_grant_version > opt_gnttab_max_version )
> > +    {
> > +        dprintk(XENLOG_INFO,
> > +                "%pd: requested grant version (%u) greater than supported 
> > (%u)\n",
> > +                d, max_grant_version, opt_gnttab_max_version);
> > +        return -EINVAL;
> > +    }
> 
> I think this wants to live in sanitise_domain_config() along with all
> the other auditing of flags and settings.

The reason to place those there is that the sanity checks for the
other grant table related parameters (max_grant_frames and
max_maptrack_frames) are performed in this function also. I think it's
better to keep the checks together.

We should consider exporting the relevant values from grant table
code and then moving all the checks to sanitise_domain_config, but
likely a follow up work given the current point in the release.

> Also, it can be simplified:
> 
> if ( max_grant_version < 1 ||
>     max_grant_version > opt_gnttab_max_version )
> {
>     dprintk(XENLOG_INFO, "Requested gnttab max version %u outside of
> supported range [%u, %u]\n", ...);
> }

It was originally done this way so that the first check
(!max_grant_version) could be adjusted when support for
max_grant_version == 0 was introduced [0] in order to signal the
disabling of grant tables altogether.

> 
> 
> > +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> > +         max_grant_version < 2 )
> > +        dprintk(XENLOG_INFO,
> > +                "%pd: host memory above 16Tb and grant table v2 
> > disabled\n",
> > +                d);
> 
> This is rather more complicated.
> 
> For PV, this going wrong in the first place is conditional on CONFIG_BIGMEM.
> For HVM, it the guest address size, not the host.
> For ARM, I don't even know, because I've lost track of which bits of the
> ABI are directmap in an otherwise translated domain.

This was only aiming to cover the PV case, which I think it's the more
likely one. It's possible there's people attempting to create PV
guests on a 16TB machine, but I think it's more unlikely that the
guest itself will have 16TB of RAM.

> I think it is probably useful to do something about it, but probably not
> in this patch.

I'm fine with this, we had no warning at all before, so I don't think
it should be a hard requirement to add one now. It would be nice if
there's consensus, but otherwise let's just skip it.

> Perhaps modify domain_set_alloc_bitsize() to impose an upper limit for
> the "host memory size matters" cases?
> 
> For the guest address size cases, this possibly wants to feed in to the
> max policy calculations in the same way that shadow kinda does.

So grant table version will basically clamp the amount of memory a
guest can use?

What about guests that doesn't use grant tables at all, do we expect
those to set the future max_grant_version to 0 in order to avoid the
clamping without having to expose grant v2?

> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 51017b47bc..0ec57614bd 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -89,14 +89,20 @@ struct xen_domctl_createdomain {
> >      /*
> >       * Various domain limits, which impact the quantity of resources
> >       * (global mapping space, xenheap, etc) a guest may consume.  For
> > -     * max_grant_frames and max_maptrack_frames, < 0 means "use the
> > -     * default maximum value in the hypervisor".
> > +     * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0
> > +     * means "use the default maximum value in the hypervisor".
> >       */
> >      uint32_t max_vcpus;
> >      uint32_t max_evtchn_port;
> >      int32_t max_grant_frames;
> >      int32_t max_maptrack_frames;
> >  
> > +/* Grant version, use low 4 bits. */
> > +#define XEN_DOMCTL_GRANT_version_mask    0xf
> > +#define XEN_DOMCTL_GRANT_version_default 0xf
> 
> This needs to be a toolstack decision, not something in Xen.  This
> doesn't fix the case where VMs can't cope with change underfoot.
> 
> It is fine for the user say "use the default", but this must be turned
> into an explicit 1 or 2 by the toolstack, so that the version(s) visible
> to the guest remains invariant while it is booted.

Please bear with me, as I'm afraid I don't understand why this is
relevant. Allowed max grant version can only change as a result of a
migration, and A VM being booted cannot (usually) be migrated, as it
requires guest cooperation (and a fully setup xenstore).

Any guest allowing migration during boot (which is AFAICT the only way
for a max grant table version change) should be capable of handling
the max grant table version changing on resume, whereas this resume
happens in the middle of the boot process is a guest decision, and it
should be capable of handling such changes, or else refuse to suspend
during boot. Any resume process will always involve a
re-initialization of the grant table.

If the intent is to make this compatible with transparent live
migration I think there are also other grant table structures that
will need to be migrated in that case, and thus the version would
already be conveyed there.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.