[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy



On Fri, May 15, 2015 at 09:52:23AM +0800, Chen, Tiejun wrote:
[...]
> 
> 
> ...
> 
> B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
> 
> =over 4
> 
> =item B<KEY=VALUE>
> 
> Possible B<KEY>s are:
> 
> =over 4
> 
> =item B<type="STRING">
> 
> Currently we just have one type. "host" means all reserved device memory on
> this platform should be reserved in this VM's pfn space.
> 
> =over 4
> 
> =item B<reserve="STRING">
> ...
> 

Yes, something like this.

> 
> >
> >>>
[...]
> >>>>index 9af0e99..d7434d6 100644
> >>>>--- a/docs/misc/vtd.txt
> >>>>+++ b/docs/misc/vtd.txt
> >>>>@@ -111,6 +111,40 @@ in the config file:
> >>>>  To override for a specific device:
> >>>>          pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
> >>>>
> >>>>+RDM, 'reserved device memory', for PCI Device Passthrough
> >>>>+---------------------------------------------------------
> >>>>+
> >>>>+There are some devices the BIOS controls, for e.g. USB devices to perform
> >>>>+PS2 emulation. The regions of memory used for these devices are marked
> >>>>+reserved in the e820 map. When we turn on DMA translation, DMA to those
> >>>>+regions will fail. Hence BIOS uses RMRR to specify these regions along 
> >>>>with
> >>>>+devices that need to access these regions. OS is expected to setup
> >>>>+identity mappings for these regions for these devices to access these 
> >>>>regions.
> >>>>+
> >>>>+While creating a VM we should reserve them in advance, and avoid any 
> >>>>conflicts.
> >>>>+So we introduce user configurable parameters to specify RDM resource and
> >>>>+according policies,
> >>>>+
> >>>>+To enable this globally, add "rdm" in the config file:
> >>>>+
> >>>>+    rdm = [ 'host, reserve=force/try' ]
> >>>>+
> >>>
> >>>The "force/try" should be called "policy". And then you explain what
> >>>policies we have.
> >>
> >>Do you mean we should rename this?
> >>
> >>rdm = [ 'host, policy=force/try' ]
> >>
> >
> >No, I didn't ask you to rename that.
> >
> >The above line is an example which should reflect the correct syntax.
> >"force/try" is not the *actual syntax*, i.e. you won't write that in
> >your config file.
> >
> >I meant to changes it to "reserve=POLICY". Then you explain the possible
> >values of POLICY.
> >
> 
> Understood so what about this,
> 
> To enable this globally, add "rdm" in the config file:
> 
>     rdm = [ 'host, reserve=<POLICY>' ]
> 

OK, so this is a specific example in vtd.txt. Last time I misread it as
part of the manpage.

I think you meant in this specific example (with other suggestions
incorporated):

     rdm = "type=host, reserve=force"

Then you point user to xl.cfg manpage.

> Or just for a specific device:
> 
>     pci = [ '01:00.0,rdm_reserve=<POLICY>', '03:00.0' ]
> 

Same here.

Just don't write "force/try" or "strcit/relax" because that's not the
exact syntax you would use in a real config file.

> Global RDM parameter allows user to specify reserved regions explicitly.
> Using "host" to include all reserved regions reported on this platform
> which is good to handle hotplug scenario. In the future this parameter
> may be further extended to allow specifying random regions, e.g. even
> those belonging to another platform as a preparation for live migration
> with passthrough devices.
> 
> Currently "POLICY" includes two options, "strict" and "relaxed". It decides
> how to handle conflict when reserving RDM regions in pfn space. If conflict
> ...
> 
> >>This is really a policy but 'reserve' may can reflect our action explicitly,
> >>right?
> >>
> >>>
> >>>>+Or just for a specific device:
> >>>>+
> >>>>+ pci = [ '01:00.0,rdm_reserve=force/try', '03:00.0' ]
> >>
> >>And you also can see this.
> >>
> >>But anyway, if you're really stick to rename this, I'm going to be fine as
> >>well but we should ping every one to check this point since this name is
> >>from our previous discussion.
> >>
> >>>>+
> >>>>+Global RDM parameter allows user to specify reserved regions explicitly.
> >>>>+Using 'host' to include all reserved regions reported on this platform
> >>>>+which is good to handle hotplug scenario. In the future this parameter
> >>>>+may be further extended to allow specifying random regions, e.g. even
> >>>>+those belonging to another platform as a preparation for live migration
> >>>>+with passthrough devices.
> >>>>+
> >>>>+'force/try' policy decides how to handle conflict when reserving RDM
> >>>>+regions in pfn space. If conflict exists, 'force' means an immediate 
> >>>>error
> >>>>+so VM will be killed, while 'try' allows moving forward with a warning
> >>>
> >>>Be killed by whom? I think it's hvmloader crashes voluntarily, right?
> >>
> >>s/VM will be kille/hvmloader crashes voluntarily
> >>
> >>>
> >>>>+message thrown out.
> >>>>+
> >>>>
> >>>>  Caveat on Conventional PCI Device Passthrough
> >>>>  ---------------------------------------------
> >>>>diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> >>>>index 98687bd..9ed40d4 100644
> >>>>--- a/tools/libxl/libxl_create.c
> >>>>+++ b/tools/libxl/libxl_create.c
> >>>>@@ -1407,6 +1407,11 @@ static void domcreate_attach_pci(libxl__egc *egc, 
> >>>>libxl__multidev *multidev,
> >>>>      }
> >>>>
> >>>>      for (i = 0; i < d_config->num_pcidevs; i++) {
> >>>>+        /*
> >>>>+         * If the rdm global policy is 'force' we should override each 
> >>>>device.
> >>>>+         */
> >>>>+        if (d_config->b_info.rdm.reserve == LIBXL_RDM_RESERVE_FLAG_FORCE)
> >>>>+            d_config->pcidevs[i].rdm_reserve = 
> >>>>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>>>          ret = libxl__device_pci_add(gc, domid, &d_config->pcidevs[i], 
> >>>> 1);
> >>>>          if (ret < 0) {
> >>>>              LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> >>>>diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >>>>index 47af340..5786455 100644
> >>>>--- a/tools/libxl/libxl_types.idl
> >>>>+++ b/tools/libxl/libxl_types.idl
> >>>>@@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [
> >>>>      (2, "PV"),
> >>>>      ], init_val = "LIBXL_DOMAIN_TYPE_INVALID")
> >>>>
> >>>>+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
> >>>>+    (0, "none"),
> >>>>+    (1, "host"),
> >>>>+    ])
> >>>>+
> >>>>+libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
> >>>>+    (-1, "invalid"),
> >>>>+    (0, "force"),
> >>>>+    (1, "try"),
> >>>>+    ])
> >>>
> >>>If you don't set init_val, the default value would be "force" (0), is this
> >>
> >>Yes.
> >>
> >>>want you want?
> >>
> >>We have a little bit of complexity here,
> >>
> >>"Default per-device RDM policy is 'force', while default global RDM policy
> >>is 'try'. When both policies are specified on a given region, 'force' is
> >>always preferred."
> >>
> >
> >This is going to be done in actual code anyway.
> >
> >This type is used both in global and per-device setting, so I envisage
> 
> Yes.
> 
> >this to have an invalid value to start with. Appropriate default values
> 
> Sounds I should set this,
> 
> +libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
> +    (-1, "invalid"),
> +    (0, "strict"),
> +    (1, "relaxed"),
> +    ], init_val = "LIBXL_RDM_RESERVE_FLAG_INVALID")
> +
> 

Yes, and then don't forget to set the value to appropriate value in the
_setdefault functions for different types.

> 
> >should be done in libxl_TYPE_setdefault functions. And the logic to
> >detect conflict and preferences done in your construct function.
> >
> >What do you think?
> >
> >>>
> >>>>+

[...]

> >>>>                      pcidev->permissive = atoi(tok);
> >>>>                  }else if ( !strcmp(optkey, "seize") ) {
> >>>>                      pcidev->seize = atoi(tok);
> >>>>+                }else if ( !strcmp(optkey, "rdm_reserve") ) {
> >>>>+                    if ( !strcmp(tok, "force") ) {
> >>>>+                        pcidev->rdm_reserve = 
> >>>>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>>>+                    } else if ( !strcmp(tok, "try") ) {
> >>>>+                        pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >>>>+                    } else {
> >>>>+                        pcidev->rdm_reserve = 
> >>>>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>>>+                        XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag 
> >>>>value:"
> >>>>+                                          " %s, so goes 'force' by 
> >>>>default.",
> >>>
> >>>If this is not an error, you don't need XLU__PCI_ERR.
> >>>
> >>>But I would say we should  just treat this as an error and
> >>>abort/exit/report (whatever the parser should do in this case).
> >>
> >>In our case we just want to post a message to set a appropriate flag to
> >>recover this behavior like we write here,
> >>
> >>                         XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag
> >>value:"
> >>                                           " %s, so goes 'strict' by
> >>default.",
> >>                                      tok);
> >
> >I suggest we just abort in this case and not second guess what the admin
> >wants.
> 
> Okay,
>                     } else {
>                         XLU__PCI_ERR(cfg, "%s is not an valid PCI RDM
> property"
>                                           " flag: 'strict' or 'relaxed'.",
>                                      tok);
>                         abort();
> 

No, not calling the "abort" function. I meant returning appropriate error
value and let the caller handles this situation.

> 
> >
> >>
> >>This may just be a warning? But I don't we have this sort of definition,
> >>XLU__PCI_WARN, ...
> >>
> >>So what LOG format can be adopted here?
> >
> >Feel free to introduce XLU__PCI_WARN if it turns out to be necessary.
> 
> If it goes to abort(), I think XLU__PCI_ERR() should be good.
> 
> >
> >>
> >>>
> >>>>+                                     tok);
> >>>>+                    }
> >>>>                  }else{
> >>>>                      XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s", 
> >>>> optkey);
> >>>>                  }
> >>>>@@ -167,6 +180,71 @@ parse_error:
> >>>>      return ERROR_INVAL;
> >>>>  }
> >>>>
> >>>>+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char 
> >>>>*str)
> >>>>+{
> >>>>+    unsigned state = STATE_TYPE;
> >>>>+    char *buf2, *tok, *ptr, *end;
> >>>>+
> >>>>+    if ( NULL == (buf2 = ptr = strdup(str)) )
> >>>>+        return ERROR_NOMEM;
> >>>>+
> >>>>+    for(tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) {
> >>>>+        switch(state) {
> >
> >Coding style. I haven't checked what actual style this file uses, but
> >there is inconsistency in this function by itself.
> 
> I just refer to xlu_pci_parse_bdf() to generate xlu_rdm_parse(), and they
> are in the same file...
> 
> Anyway, I should change this line,
> 
> for ( tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++ ) {
> 

  for (tok = ptr, end...)

  switch (state) {


> >
> >>>>+        case STATE_TYPE:
> >>>>+            if ( *ptr == '\0' || *ptr == ',' ) {
> >>>>+                state = STATE_CHECK_FLAG;
> >>>>+                *ptr = '\0';

[...]

> >>>>
> >>>>+    /*
> >>>>+     * By default our global policy is to query all rdm entries, and
> >>>>+     * force reserve them.
> >>>>+     */
> >>>>+    b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST;
> >>>>+    b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >>>
> >>>This should probably to into the _setdefault function of
> >>>libxl_domain_build_info.
> >>
> >>Sorry, I just see this
> >>
> >>libxl_domain_build_info_init()
> >>     |
> >>     + libxl_rdm_reserve_init(&p->rdm);
> >>    |
> >>    + memset(p, '\0', sizeof(*p));
> >>
> >>But this should be generated automatically, right? So how to implement your
> >>idea? Could you give me a show?
> >>
> >
> >Check libxl_domain_build_info_setdefault.
> >
> >To use libxl types. You normally do:
> >
> >   libxl_TYPE_init
> >   libxl_TYPE_setdefault
> >
> >   DO STUFF
> >
> >   libxl_TYPE_dispose
> >
> >_init and _dispose are auto-generated. _setdefault is not.
> 
> So in our case, maybe we can do this,
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index f0da7dc..461606c 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -100,6 +100,17 @@ static int sched_params_valid(libxl__gc *gc,
>      return 1;
>  }
> 
> +void libxl__device_rdm_setdefault(libxl__gc *gc,
> +                                  libxl_domain_build_info *b_info)

It's not a device. Use libxl__rdm_setdefault.

> +{
> +    /*
> +     * By default our global policy is to query all rdm entries, and
> +     * force reserve them.
> +     */
> +    b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST;
> +    b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_STRICT;
> +}
> +

Isn't the global policy "relaxed" (or "try")? At least that's what your
old code does. BTW your original code contradicts your original comment.

>  int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                                          libxl_domain_build_info *b_info)
>  {
> @@ -410,6 +421,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                     libxl_domain_type_to_string(b_info->type));
>          return ERROR_INVAL;
>      }
> +
> +    libxl__device_rdm_setdefault(gc, b_info);
>      return 0;
>  }
> 

And you also need to modify libxl__device_pci_setdefault.

I actually have another question on the interface design. To recap, in
your patch:

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 47af340..5786455 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [
     (2, "PV"),                                                    
     ], init_val = "LIBXL_DOMAIN_TYPE_INVALID")

+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
+    (0, "none"),
+    (1, "host"),
+    ])
+
+libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
+    (-1, "invalid"),
+    (0, "force"),
+    (1, "try"),
+    ])
+
 libxl_channel_connection = Enumeration("channel_connection", [
     (0, "UNKNOWN"),
     (1, "PTY"),    
@@ -356,6 +367,11 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
     ("budget",       integer, {'init_val': 
'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
     ])

+libxl_rdm_reserve = Struct("rdm_reserve", [
+    ("type",    libxl_rdm_reserve_type),
+    ("reserve",   libxl_rdm_reserve_flag),
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),                     
     ("avail_vcpus",     libxl_bitmap),
@@ -401,6 +417,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("kernel",           string),
     ("cmdline",          string),
     ("ramdisk",          string),
+    ("rdm",     libxl_rdm_reserve),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware",         string),
                                        ("bios",             libxl_bios_type),
@@ -521,6 +538,7 @@ libxl_device_pci = Struct("device_pci", [
     ("power_mgmt", bool),
     ("permissive", bool),
     ("seize", bool),
+    ("rdm_reserve",   libxl_rdm_reserve_flag),
     ])

Do you actually need libxl_rdm_reserve type? I.e. do you envisage that
structure to change a lot? Can you not just use libxl_rdm_reserve_type
and libxl_rdm_reserve_flag in build_info.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.