[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/3] x86/mem_sharing: option to skip populating special pages during fork



On Thu, Mar 24, 2022 at 11:51 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Thu, Mar 24, 2022 at 11:15:08AM -0400, Tamas K Lengyel wrote:
> > On Thu, Mar 24, 2022 at 10:50 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> 
> > wrote:
> > >
> > > On Tue, Mar 22, 2022 at 01:41:37PM -0400, Tamas K Lengyel wrote:
> > > > Add option to the fork memop to skip populating the fork with special 
> > > > pages.
> > > > These special pages are only necessary when setting up forks to be fully
> > > > functional with a toolstack. For short-lived forks where no toolstack 
> > > > is active
> > > > these pages are uneccesary.
> > >
> > > I'm not sure those are strictly related to having a toolstack. For
> > > example the vcpu_info has nothing to do with having a toolstack, and
> > > is only used by the guest in order to receive events or fetch time
> > > related data. So while a short-lived fork might not make use of those,
> > > that has nothing to do with having a toolstack or not.
> >
> > Fair enough, the point is that the short live fork doesn't use these pages.
> >
> > > >
> > > > Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
> > > > ---
> > > >  xen/arch/x86/include/asm/hvm/domain.h |  4 +++-
> > > >  xen/arch/x86/mm/mem_sharing.c         | 33 +++++++++++++++++----------
> > > >  xen/include/public/memory.h           |  4 ++--
> > > >  3 files changed, 26 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/xen/arch/x86/include/asm/hvm/domain.h 
> > > > b/xen/arch/x86/include/asm/hvm/domain.h
> > > > index 698455444e..446cd06411 100644
> > > > --- a/xen/arch/x86/include/asm/hvm/domain.h
> > > > +++ b/xen/arch/x86/include/asm/hvm/domain.h
> > > > @@ -31,7 +31,9 @@
> > > >  #ifdef CONFIG_MEM_SHARING
> > > >  struct mem_sharing_domain
> > > >  {
> > > > -    bool enabled, block_interrupts;
> > > > +    bool enabled;
> > > > +    bool block_interrupts;
> > > > +    bool skip_special_pages;
> > > >
> > > >      /*
> > > >       * When releasing shared gfn's in a preemptible manner, recall 
> > > > where
> > > > diff --git a/xen/arch/x86/mm/mem_sharing.c 
> > > > b/xen/arch/x86/mm/mem_sharing.c
> > > > index 15e6a7ed81..84c04ddfa3 100644
> > > > --- a/xen/arch/x86/mm/mem_sharing.c
> > > > +++ b/xen/arch/x86/mm/mem_sharing.c
> > > > @@ -1643,7 +1643,8 @@ static int bring_up_vcpus(struct domain *cd, 
> > > > struct domain *d)
> > > >      return 0;
> > > >  }
> > > >
> > > > -static int copy_vcpu_settings(struct domain *cd, const struct domain 
> > > > *d)
> > > > +static int copy_vcpu_settings(struct domain *cd, const struct domain 
> > > > *d,
> > > > +                              bool skip_special_pages)
> > > >  {
> > > >      unsigned int i;
> > > >      struct p2m_domain *p2m = p2m_get_hostp2m(cd);
> > > > @@ -1660,7 +1661,7 @@ static int copy_vcpu_settings(struct domain *cd, 
> > > > const struct domain *d)
> > > >
> > > >          /* Copy & map in the vcpu_info page if the guest uses one */
> > > >          vcpu_info_mfn = d_vcpu->vcpu_info_mfn;
> > > > -        if ( !mfn_eq(vcpu_info_mfn, INVALID_MFN) )
> > > > +        if ( !skip_special_pages && !mfn_eq(vcpu_info_mfn, 
> > > > INVALID_MFN) )
> > > >          {
> > > >              mfn_t new_vcpu_info_mfn = cd_vcpu->vcpu_info_mfn;
> > > >
> > > > @@ -1807,17 +1808,18 @@ static int copy_special_pages(struct domain 
> > > > *cd, struct domain *d)
> > > >      return 0;
> > > >  }
> > > >
> > > > -static int copy_settings(struct domain *cd, struct domain *d)
> > > > +static int copy_settings(struct domain *cd, struct domain *d,
> > > > +                         bool skip_special_pages)
> > > >  {
> > > >      int rc;
> > > >
> > > > -    if ( (rc = copy_vcpu_settings(cd, d)) )
> > > > +    if ( (rc = copy_vcpu_settings(cd, d, skip_special_pages)) )
> > > >          return rc;
> > > >
> > > >      if ( (rc = hvm_copy_context_and_params(cd, d)) )
> > > >          return rc;
> > > >
> > > > -    if ( (rc = copy_special_pages(cd, d)) )
> > > > +    if ( !skip_special_pages && (rc = copy_special_pages(cd, d)) )
> > > >          return rc;
> > > >
> > > >      copy_tsc(cd, d);
> > > > @@ -1826,9 +1828,11 @@ static int copy_settings(struct domain *cd, 
> > > > struct domain *d)
> > > >      return rc;
> > > >  }
> > > >
> > > > -static int fork(struct domain *cd, struct domain *d)
> > > > +static int fork(struct domain *cd, struct domain *d, uint16_t flags)
> > > >  {
> > > >      int rc = -EBUSY;
> > > > +    bool block_interrupts = flags & XENMEM_FORK_BLOCK_INTERRUPTS;
> > > > +    bool skip_special_pages = flags & XENMEM_FORK_SKIP_SPECIAL_PAGES;
> > > >
> > > >      if ( !cd->controller_pause_count )
> > > >          return rc;
> > > > @@ -1856,7 +1860,13 @@ static int fork(struct domain *cd, struct domain 
> > > > *d)
> > > >      if ( (rc = bring_up_vcpus(cd, d)) )
> > > >          goto done;
> > > >
> > > > -    rc = copy_settings(cd, d);
> > > > +    if ( !(rc = copy_settings(cd, d, skip_special_pages)) )
> > >
> > > Can you set
> > > cd->arch.hvm.mem_sharing.{block_interrupts,skip_special_pages} earlier
> > > so that you don't need to pass the options around to copy_settings and
> > > copy_vcpu_settings?
> >
> > Would be possible yes, but then we would have to clear them in case
> > the forking failed at any point. Setting them only at the end when the
> > fork finished ensures that those fields are only ever valid if the VM
> > is a fork. Both are valid approaches and I prefer this over the other.
>
> I think I'm confused, doesn't the fork get destroyed if there's a
> failure? In which case the values
> cd->arch.hvm.mem_sharing.{block_interrupts,skip_special_pages}
> shouldn't really matter?

No, the domain that will be a fork is just a regular domain until the
fork operation completes. If the fork operation fails the domain
remains.

> > >
> > > > +    {
> > > > +        cd->arch.hvm.mem_sharing.block_interrupts = block_interrupts;
> > > > +        cd->arch.hvm.mem_sharing.skip_special_pages = 
> > > > skip_special_pages;
> > > > +        /* skip mapping the vAPIC page on unpause if skipping special 
> > > > pages */
> > > > +        cd->creation_finished = skip_special_pages;
> > >
> > > I think this is dangerous. While it might be true at the moment that
> > > the arch_domain_creation_finished only maps the vAPIC page, there's no
> > > guarantee it couldn't do other stuff in the future that could be
> > > required for the VM to be started.
> >
> > I understand this domain_creation_finished route could be expanded in
> > the future to include other stuff, but IMHO we can evaluate what to do
> > about that when and if it becomes necessary. This is all experimental
> > to begin with.
>
> Maybe you could check the skip_special_pages field from
> domain_creation_finished in order to decide whether the vAPIC page
> needs to be mapped?

Could certainly do that but it means adding experimental code in an
#ifdef to the vAPIC mapping logic. Technically nothing wrong with that
but I would prefer keeping all this code in a single-place if
possible.

Tamas



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.