[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v14 3/3] xen/tools: VM forking toolstack side



On Thu, Apr 9, 2020 at 10:22 AM Wei Liu <wl@xxxxxxx> wrote:
>
> On Thu, Apr 09, 2020 at 09:51:35AM -0600, Tamas K Lengyel wrote:
> > On Thu, Apr 9, 2020 at 9:43 AM Wei Liu <wl@xxxxxxx> wrote:
> > >
> > > On Mon, Apr 06, 2020 at 08:20:05AM -0700, Tamas K Lengyel wrote:
> > > > Add necessary bits to implement "xl fork-vm" commands. The command 
> > > > allows the
> > > > user to specify how to launch the device model allowing for a 
> > > > late-launch model
> > > > in which the user can execute the fork without the device model and 
> > > > decide to
> > > > only later launch it.
> > > >
> > > > Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
> > > > ---
> > > [...]
> > > >
> > > > +int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
> > > > +                       libxl__domain_build_state *state,
> > > > +                       uint32_t *domid, bool soft_reset)
> > >
> > > It would have been nice if you could split the refactoring out to a
> > > separate patch.
> >
> > I found the toolstack to be way harder to work with then the
> > hypervisor code-base. Splitting the patches would have been nice but I
> > don't even know how to begin that since it's all such a spaghetti.
>
> In this case you've split out some code from a function to make a new
> function. That is a self-contained task, which can be in its own patch.
>
> >
> > >
> > > >  static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
> > > >                               libxl_domain_build_info *b_info)
> > > >  {
> > > > @@ -1191,16 +1207,32 @@ static void initiate_domain_create(libxl__egc 
> > > > *egc,
> > > >      ret = libxl__domain_config_setdefault(gc,d_config,domid);
> > > >      if (ret) goto error_out;
> > > >
> > > > -    ret = libxl__domain_make(gc, d_config, &dcs->build_state, &domid,
> > > > -                             dcs->soft_reset);
> > > > -    if (ret) {
> > > > -        LOGD(ERROR, domid, "cannot make domain: %d", ret);
> > > > +    if ( !d_config->dm_restore_file )
> > > > +    {
> > > > +        ret = libxl__domain_make(gc, d_config, &dcs->build_state, 
> > > > &domid,
> > > > +                                 dcs->soft_reset);
> > > >          dcs->guest_domid = domid;
> > > > +
> > > > +        if (ret) {
> > > > +            LOGD(ERROR, domid, "cannot make domain: %d", ret);
> > > > +            ret = ERROR_FAIL;
> > > > +            goto error_out;
> > > > +        }
> > > > +    } else if ( dcs->guest_domid != INVALID_DOMID ) {
> > >
> > > Coding style issue here.
> > >
> > > There are quite a few of them.  I won't point them out one by one
> > > though. Let's focus on the interface first.
> >
> > Do we have an automatic formatter we could just run on this code-base?
> > I don't know what style issue you are referring to and given that the
> > coding style here is different here compared to the hypervisor I find
> > it very hard to figure it out what other issues you spotted. Please
> > report them because I won't be able to spot them manually. To me it
> > all looks fine as-is.
>
> I feel your pain. There was work in progress to provide a style checker,
> but we're not there yet.
>
> Xen and libxc share one coding style while libxl and xl share another.
> There is a CODING_STYLE file under libxl directory.  The problem here is
> you should not have spaces inside ().
>
> Generally I think pointing out coding style issues tend to distract
> people from discussing more important things, so I would leave them last
> to fix.

I agree. I would highly prefer if we would get to a spot where they
wouldn't even have to be pointed out other then "run this command to
fix up the style". I submitted an astyle template already for Xen,
others prefer clang to do it, yet it's been like a year and doesn't
look like we will go anywhere with either. Just a waste of time IMHO.

>
> >
> > > >
> > > > +/*
> > > > + * The parent domain is expected to be created with default settings 
> > > > for
> > > > + * - max_evtch_port
> > > > + * - max_grant_frames
> > > > + * - max_maptrack_frames
> > > > + */
> > > > +int libxl_domain_fork_vm(libxl_ctx *ctx, uint32_t pdomid, uint32_t 
> > > > max_vcpus, uint32_t *domid)
> > > > +{
> > > > +    int rc;
> > > > +    struct xen_domctl_createdomain create = {0};
> > > > +    create.flags |= XEN_DOMCTL_CDF_hvm;
> > > > +    create.flags |= XEN_DOMCTL_CDF_hap;
> > > > +    create.flags |= XEN_DOMCTL_CDF_oos_off;
> > > > +    create.arch.emulation_flags = (XEN_X86_EMU_ALL & 
> > > > ~XEN_X86_EMU_VPCI);
> > > > +    create.ssidref = SECINITSID_DOMU;
> > > > +    create.max_vcpus = max_vcpus;
> > >
> > >
> > > Some questions:
> > >
> > > Why does the caller need to specify the number of vcpus?
> > >
> > > Since the parent (source) domain is around, can you retrieve its domain
> > > configuration such that you know its max_vcpus and other information
> > > including max_evtchn_port and maptrack frames?
> >
> > Because we want to avoid having to issue an extra hypercall for these.
> > Normally these pieces of information will be available for the user
> > and won't change, so we save time by not querying for it every time a
> > fork is created. Remember, we might be creating thousands of forks in
> > a very short time, so those extra hypercalls add up.
>
> I see. Speed is a big concern to you.
>
> What I was referring to doesn't require issuing hypercalls but requires
> calling libxl_retrieve_domain_configuration. That's likely to be even
> slower than making a hypercall.

Right. We only want to parse the domain config if the device model is
being launched.

>
> I'm afraid the current incarnation of libxl_domain_fork_vm cannot become
> supported (as in Xen's support statement) going forward, because it is
> leaky.

What do you mean by leaky?

>
> I can see two solutions: 1. batch forking to reduce the number of
> queries needed; 2. make a proper domctl which duplicates the source
> domain structure inside Xen.

I've attempted to do that by extending the domain create hypercall so
that this information can be copied in the hypervisor. That approach
was very unpopular.

>
> Both of these require extra work. I think option 2 is better. But this
> is not asking you to do the work now. See below.
>
> While we want to make libxl APIs stable, we've had cases like COLO which
> we explicitly declared unstable.  Seeing this is a niche feature, this
> probably falls into the same category. In that case we reserve the right
> to rework the interface when necessary.

I'm fine with making this declared unstable. The hypervisor side bits
are also declared experimental and aren't even compiled in the normal
case.

Thanks,
Tamas



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.