[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy
On Mon, May 11, 2015 at 01:35:06PM +0800, Chen, Tiejun wrote: > On 2015/5/8 21:04, Wei Liu wrote: > >Sorry for the late review. > > > > Really thanks for taking your time :) > > >On Fri, Apr 10, 2015 at 05:21:52PM +0800, Tiejun Chen wrote: > >>This patch introduces user configurable parameters to specify RDM > >>resource and according policies, > >> > >>Global RDM parameter: > >> rdm = [ 'host, reserve=force/try' ] > >>Per-device RDM parameter: > >> pci = [ 'sbdf, rdm_reserve=force/try' ] > >> > >>Global RDM parameter allows user to specify reserved regions explicitly, > >>e.g. using 'host' to include all reserved regions reported on this platform > >>which is good to handle hotplug scenario. In the future this parameter > >>may be further extended to allow specifying random regions, e.g. even > >>those belonging to another platform as a preparation for live migration > >>with passthrough devices. > >> > >>'force/try' policy decides how to handle conflict when reserving RDM > >>regions in pfn space. If conflict exists, 'force' means an immediate error > >>so VM will be killed, while 'try' allows moving forward with a warning > >>message thrown out. > >> > >>Default per-device RDM policy is 'force', while default global RDM policy > >>is 'try'. When both policies are specified on a given region, 'force' is > >>always preferred. > >> > >>Signed-off-by: Tiejun Chen <tiejun.chen@xxxxxxxxx> > >>--- > >> docs/man/xl.cfg.pod.5 | 44 +++++++++++++++++++++++++ > >> docs/misc/vtd.txt | 34 ++++++++++++++++++++ > >> tools/libxl/libxl_create.c | 5 +++ > >> tools/libxl/libxl_types.idl | 18 +++++++++++ > >> tools/libxl/libxlu_pci.c | 78 > >> +++++++++++++++++++++++++++++++++++++++++++++ > >> tools/libxl/libxlutil.h | 4 +++ > >> tools/libxl/xl_cmdimpl.c | 21 +++++++++++- > >> 7 files changed, 203 insertions(+), 1 deletion(-) > >> > >>diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 > >>index 408653f..9ed3055 100644 > >>--- a/docs/man/xl.cfg.pod.5 > >>+++ b/docs/man/xl.cfg.pod.5 > >>@@ -583,6 +583,36 @@ assigned slave device. > >> > >> =back > >> > >>+=item B<rdm=[ "TYPE", "RDM_RESERVE_STRING", ... ]> > >>+ > > > >Shouldn't this be "TYPE,RDM_RESERVE_STRIGN" according to your commit > >message? If the only available config is just one string, you probably > >don't need a list for this? > > Yes, based on that design we don't need a list. So > > =item B<rdm=[ "RDM_RESERVE_STRING" ]> > Note that this is still a list (enclosed by "[]"). Maybe you mean rdm = "RDM_RESERVE_STRING" ? > > > >>+(HVM/x86 only) Specifies the information about Reserved Device Memory > >>(RDM), > >>+which is necessary to enable robust device passthrough usage. One example > >>of > >>+RDM is reported through ACPI Reserved Memory Region Reporting (RMRR) > >>+structure on x86 platform. > >>+Each B<RDM_CHECK_STRING> has the form C<["TYPE",KEY=VALUE,KEY=VALUE,...> > >>where: > >>+ > > > >RDM_CHECK_STRING? > > And here should be corrected like this, > > B<RDM_RESERVE_STRING> has the form ... > > > > >>+=over 4 > >>+ > >>+=item B<"TYPE"> > >>+ > >>+Currently we just have one type. 'host' means all reserved device memory on > >>+this platform should be reserved in this VM's pfn space. > >>+ > > > >What are other possible types? If there is only one type then we can > > Currently we just have one type and looks that design doesn't make this > clear. > > >simply ignore the type? > > I just think we may introduce something else specific to live migration in > the future... But I'm really not sure right now. > Fair enough. I was just wondering if there would be any other types. If so we do need provisioning. In any case, the "type" argument you proposed is a positional argument (you require it to be the first element of the spec string"). I think you can just make it a key-value pair to make parsing easier. > > > >>+=item B<KEY=VALUE> > >>+ > >>+Possible B<KEY>s are: > >>+ > >>+=over 4 > >>+ > >>+=item B<reserve="STRING"> > >>+ > >>+Conflict may be detected when reserving reserved device memory in gfn > >>space. > >>+'force' means an unsolved conflict leads to immediate VM destroy, while > > > >Do you mean "immediate VM crash"? > > Yes. So I guess I should replace this. > > > > >>+'try' allows VM moving forward with a warning message thrown out. 'try' > >>+is default. > > > >Can you please your double quotes for "force", "try" etc. > > Sure. Just note we'd like to use "strict"/"relaxed" to replace "force"/"try" > from next revision according to Jan's suggestion. > No problem. > > > >>+ > >>+Note this may be overrided by another sub item, rdm_reserve, in pci device. > >>+ > >> =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]> > >> > >> Specifies the host PCI devices to passthrough to this guest. Each > >> B<PCI_SPEC_STRING> > >>@@ -645,6 +675,20 @@ dom0 without confirmation. Please use with care. > >> D0-D3hot power management states for the PCI device. False (0) by > >> default. > >> > >>+=item B<rdm_check="STRING"> > >>+ > >>+(HVM/x86 only) Specifies the information about Reserved Device Memory > >>(RDM), > >>+which is necessary to enable robust device passthrough usage. One example > >>of > >>+RDM is reported through ACPI Reserved Memory Region Reporting (RMRR) > >>+structure on x86 platform. > >>+ > >>+Conflict may be detected when reserving reserved device memory in gfn > >>space. > >>+'force' means an unsolved conflict leads to immediate VM destroy, while > >>+'try' allows VM moving forward with a warning message thrown out. 'force' > >>+is default. > >>+ > >>+Note this would override another global item, rdm = ['']. > >>+ > > > >Note this would override global B<rdm> option. > > Fixed. > > > > >> =back > >> > >> =back > >>diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt > >>index 9af0e99..d7434d6 100644 > >>--- a/docs/misc/vtd.txt > >>+++ b/docs/misc/vtd.txt > >>@@ -111,6 +111,40 @@ in the config file: > >> To override for a specific device: > >> pci = [ '01:00.0,msitranslate=0', '03:00.0' ] > >> > >>+RDM, 'reserved device memory', for PCI Device Passthrough > >>+--------------------------------------------------------- > >>+ > >>+There are some devices the BIOS controls, for e.g. USB devices to perform > >>+PS2 emulation. The regions of memory used for these devices are marked > >>+reserved in the e820 map. When we turn on DMA translation, DMA to those > >>+regions will fail. Hence BIOS uses RMRR to specify these regions along with > >>+devices that need to access these regions. OS is expected to setup > >>+identity mappings for these regions for these devices to access these > >>regions. > >>+ > >>+While creating a VM we should reserve them in advance, and avoid any > >>conflicts. > >>+So we introduce user configurable parameters to specify RDM resource and > >>+according policies, > >>+ > >>+To enable this globally, add "rdm" in the config file: > >>+ > >>+ rdm = [ 'host, reserve=force/try' ] > >>+ > > > >The "force/try" should be called "policy". And then you explain what > >policies we have. > > Do you mean we should rename this? > > rdm = [ 'host, policy=force/try' ] > No, I didn't ask you to rename that. The above line is an example which should reflect the correct syntax. "force/try" is not the *actual syntax*, i.e. you won't write that in your config file. I meant to changes it to "reserve=POLICY". Then you explain the possible values of POLICY. > This is really a policy but 'reserve' may can reflect our action explicitly, > right? > > > > >>+Or just for a specific device: > >>+ > >>+ pci = [ '01:00.0,rdm_reserve=force/try', '03:00.0' ] > > And you also can see this. > > But anyway, if you're really stick to rename this, I'm going to be fine as > well but we should ping every one to check this point since this name is > from our previous discussion. > > >>+ > >>+Global RDM parameter allows user to specify reserved regions explicitly. > >>+Using 'host' to include all reserved regions reported on this platform > >>+which is good to handle hotplug scenario. In the future this parameter > >>+may be further extended to allow specifying random regions, e.g. even > >>+those belonging to another platform as a preparation for live migration > >>+with passthrough devices. > >>+ > >>+'force/try' policy decides how to handle conflict when reserving RDM > >>+regions in pfn space. If conflict exists, 'force' means an immediate error > >>+so VM will be killed, while 'try' allows moving forward with a warning > > > >Be killed by whom? I think it's hvmloader crashes voluntarily, right? > > s/VM will be kille/hvmloader crashes voluntarily > > > > >>+message thrown out. > >>+ > >> > >> Caveat on Conventional PCI Device Passthrough > >> --------------------------------------------- > >>diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c > >>index 98687bd..9ed40d4 100644 > >>--- a/tools/libxl/libxl_create.c > >>+++ b/tools/libxl/libxl_create.c > >>@@ -1407,6 +1407,11 @@ static void domcreate_attach_pci(libxl__egc *egc, > >>libxl__multidev *multidev, > >> } > >> > >> for (i = 0; i < d_config->num_pcidevs; i++) { > >>+ /* > >>+ * If the rdm global policy is 'force' we should override each > >>device. > >>+ */ > >>+ if (d_config->b_info.rdm.reserve == LIBXL_RDM_RESERVE_FLAG_FORCE) > >>+ d_config->pcidevs[i].rdm_reserve = > >>LIBXL_RDM_RESERVE_FLAG_FORCE; > >> ret = libxl__device_pci_add(gc, domid, &d_config->pcidevs[i], 1); > >> if (ret < 0) { > >> LIBXL__LOG(ctx, LIBXL__LOG_ERROR, > >>diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > >>index 47af340..5786455 100644 > >>--- a/tools/libxl/libxl_types.idl > >>+++ b/tools/libxl/libxl_types.idl > >>@@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [ > >> (2, "PV"), > >> ], init_val = "LIBXL_DOMAIN_TYPE_INVALID") > >> > >>+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [ > >>+ (0, "none"), > >>+ (1, "host"), > >>+ ]) > >>+ > >>+libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [ > >>+ (-1, "invalid"), > >>+ (0, "force"), > >>+ (1, "try"), > >>+ ]) > > > >If you don't set init_val, the default value would be "force" (0), is this > > Yes. > > >want you want? > > We have a little bit of complexity here, > > "Default per-device RDM policy is 'force', while default global RDM policy > is 'try'. When both policies are specified on a given region, 'force' is > always preferred." > This is going to be done in actual code anyway. This type is used both in global and per-device setting, so I envisage this to have an invalid value to start with. Appropriate default values should be done in libxl_TYPE_setdefault functions. And the logic to detect conflict and preferences done in your construct function. What do you think? > > > >>+ > >> libxl_channel_connection = Enumeration("channel_connection", [ > >> (0, "UNKNOWN"), > >> (1, "PTY"), > >>@@ -356,6 +367,11 @@ libxl_domain_sched_params = > >>Struct("domain_sched_params",[ > >> ("budget", integer, {'init_val': > >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), > >> ]) > >> > >>+libxl_rdm_reserve = Struct("rdm_reserve", [ > >>+ ("type", libxl_rdm_reserve_type), > >>+ ("reserve", libxl_rdm_reserve_flag), > >>+ ]) > >>+ > >> libxl_domain_build_info = Struct("domain_build_info",[ > >> ("max_vcpus", integer), > >> ("avail_vcpus", libxl_bitmap), > >>@@ -401,6 +417,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ > >> ("kernel", string), > >> ("cmdline", string), > >> ("ramdisk", string), > >>+ ("rdm", libxl_rdm_reserve), > >> ("u", KeyedUnion(None, libxl_domain_type, "type", > >> [("hvm", Struct(None, [("firmware", string), > >> ("bios", > >> libxl_bios_type), > >>@@ -521,6 +538,7 @@ libxl_device_pci = Struct("device_pci", [ > >> ("power_mgmt", bool), > >> ("permissive", bool), > >> ("seize", bool), > >>+ ("rdm_reserve", libxl_rdm_reserve_flag), > >> ]) > >> > >> libxl_device_vtpm = Struct("device_vtpm", [ > >>diff --git a/tools/libxl/libxlu_pci.c b/tools/libxl/libxlu_pci.c > >>index 26fb143..45be0d9 100644 > >>--- a/tools/libxl/libxlu_pci.c > >>+++ b/tools/libxl/libxlu_pci.c > >>@@ -42,6 +42,8 @@ static int pcidev_struct_fill(libxl_device_pci *pcidev, > >>unsigned int domain, > >> #define STATE_OPTIONS_K 6 > >> #define STATE_OPTIONS_V 7 > >> #define STATE_TERMINAL 8 > >>+#define STATE_TYPE 9 > >>+#define STATE_CHECK_FLAG 10 > >> int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const > >> char *str) > >> { > >> unsigned state = STATE_DOMAIN; > >>@@ -143,6 +145,17 @@ int xlu_pci_parse_bdf(XLU_Config *cfg, > >>libxl_device_pci *pcidev, const char *str > >> pcidev->permissive = atoi(tok); > >> }else if ( !strcmp(optkey, "seize") ) { > >> pcidev->seize = atoi(tok); > >>+ }else if ( !strcmp(optkey, "rdm_reserve") ) { > >>+ if ( !strcmp(tok, "force") ) { > >>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE; > >>+ } else if ( !strcmp(tok, "try") ) { > >>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_TRY; > >>+ } else { > >>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE; > >>+ XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag > >>value:" > >>+ " %s, so goes 'force' by > >>default.", > > > >If this is not an error, you don't need XLU__PCI_ERR. > > > >But I would say we should just treat this as an error and > >abort/exit/report (whatever the parser should do in this case). > > In our case we just want to post a message to set a appropriate flag to > recover this behavior like we write here, > > XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag > value:" > " %s, so goes 'strict' by > default.", > tok); I suggest we just abort in this case and not second guess what the admin wants. > > This may just be a warning? But I don't we have this sort of definition, > XLU__PCI_WARN, ... > > So what LOG format can be adopted here? Feel free to introduce XLU__PCI_WARN if it turns out to be necessary. > > > > >>+ tok); > >>+ } > >> }else{ > >> XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s", > >> optkey); > >> } > >>@@ -167,6 +180,71 @@ parse_error: > >> return ERROR_INVAL; > >> } > >> > >>+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char *str) > >>+{ > >>+ unsigned state = STATE_TYPE; > >>+ char *buf2, *tok, *ptr, *end; > >>+ > >>+ if ( NULL == (buf2 = ptr = strdup(str)) ) > >>+ return ERROR_NOMEM; > >>+ > >>+ for(tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) { > >>+ switch(state) { Coding style. I haven't checked what actual style this file uses, but there is inconsistency in this function by itself. > >>+ case STATE_TYPE: > >>+ if ( *ptr == '\0' || *ptr == ',' ) { > >>+ state = STATE_CHECK_FLAG; > >>+ *ptr = '\0'; > >>+ if ( !strcmp(tok, "host") ) { > >>+ rdm->type = LIBXL_RDM_RESERVE_TYPE_HOST; > >>+ } else { > >>+ XLU__PCI_ERR(cfg, "Unknown RDM state option: %s", tok); > >>+ goto parse_error; > >>+ } > >>+ tok = ptr + 1; > >>+ } > >>+ break; > >>+ case STATE_CHECK_FLAG: > >>+ if ( *ptr == '=' ) { > >>+ state = STATE_OPTIONS_V; > >>+ *ptr = '\0'; > >>+ if ( strcmp(tok, "reserve") ) { > >>+ XLU__PCI_ERR(cfg, "Unknown RDM property value: %s", > >>tok); > >>+ goto parse_error; > >>+ } > >>+ tok = ptr + 1; > >>+ } > >>+ break; > >>+ case STATE_OPTIONS_V: > >>+ if ( *ptr == ',' || *ptr == '\0' ) { > >>+ state = STATE_TERMINAL; > >>+ *ptr = '\0'; > >>+ if ( !strcmp(tok, "force") ) { > >>+ rdm->reserve = LIBXL_RDM_RESERVE_FLAG_FORCE; > >>+ } else if ( !strcmp(tok, "try") ) { > >>+ rdm->reserve = LIBXL_RDM_RESERVE_FLAG_TRY; > >>+ } else { > >>+ XLU__PCI_ERR(cfg, "Unknown RDM property flag value: > >>%s", > >>+ tok); > >>+ goto parse_error; > >>+ } > >>+ tok = ptr + 1; > >>+ } > >>+ default: > >>+ break; > >>+ } > >>+ } > >>+ > >>+ free(buf2); > >>+ > >>+ if ( tok != ptr || state != STATE_TERMINAL ) > >>+ goto parse_error; > >>+ > >>+ return 0; > >>+ > >>+parse_error: > >>+ return ERROR_INVAL; > >>+} > >>+ > >> /* > >> * Local variables: > >> * mode: C > >>diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h > >>index 0333e55..80497f8 100644 > >>--- a/tools/libxl/libxlutil.h > >>+++ b/tools/libxl/libxlutil.h > >>@@ -93,6 +93,10 @@ int xlu_disk_parse(XLU_Config *cfg, int nspecs, const > >>char *const *specs, > >> */ > >> int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const > >> char *str); > >> > >>+/* > >>+ * RDM parsing > >>+ */ > >>+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char > >>*str); > >> > >> /* > >> * Vif rate parsing. > >>diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c > >>index 04faf98..9a58464 100644 > >>--- a/tools/libxl/xl_cmdimpl.c > >>+++ b/tools/libxl/xl_cmdimpl.c > >>@@ -988,7 +988,7 @@ static void parse_config_data(const char *config_source, > >> const char *buf; > >> long l; > >> XLU_Config *config; > >>- XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms; > >>+ XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms, > >>*rdms; > >> XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian; > >> int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian; > >> int pci_power_mgmt = 0; > >>@@ -1727,6 +1727,23 @@ skip_vfb: > >> xlu_cfg_get_defbool(config, "e820_host", &b_info->u.pv.e820_host, > >> 0); > >> } > >> > >>+ /* > >>+ * By default our global policy is to query all rdm entries, and > >>+ * force reserve them. > >>+ */ > >>+ b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST; > >>+ b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_TRY; > > > >This should probably to into the _setdefault function of > >libxl_domain_build_info. > > Sorry, I just see this > > libxl_domain_build_info_init() > | > + libxl_rdm_reserve_init(&p->rdm); > | > + memset(p, '\0', sizeof(*p)); > > But this should be generated automatically, right? So how to implement your > idea? Could you give me a show? > Check libxl_domain_build_info_setdefault. To use libxl types. You normally do: libxl_TYPE_init libxl_TYPE_setdefault DO STUFF libxl_TYPE_dispose _init and _dispose are auto-generated. _setdefault is not. > > > >>+ if (!xlu_cfg_get_list (config, "rdm", &rdms, 0, 0)) { > >>+ if ((buf = xlu_cfg_get_listitem (rdms, 0)) != NULL ) { > >>+ libxl_rdm_reserve rdm; > >>+ if (!xlu_rdm_parse(config, &rdm, buf)) > >>+ { > >>+ b_info->rdm.type = rdm.type; > >>+ b_info->rdm.reserve = rdm.reserve; > >>+ } > > > >You only have one rdm in b_info, so there is no need to use a list for > >it in config file. > > > > Is this fine? > > if (!xlu_cfg_get_string(config, "rdm", &buf, 0)) { > > libxl_rdm_reserve rdm; > > if (!xlu_rdm_parse(config, &rdm, buf)) { > b_info->rdm.type = rdm.type; > > b_info->rdm.reserve = rdm.reserve; > > } > } > I think it is fine. But you'd better wait a little bit for other people to voice their opinions. Wei. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |