[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy
On 2015/5/19 3:17, Wei Liu wrote: On Fri, May 15, 2015 at 09:52:23AM +0800, Chen, Tiejun wrote: [...]... B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where: =over 4 =item B<KEY=VALUE> Possible B<KEY>s are: =over 4 =item B<type="STRING"> Currently we just have one type. "host" means all reserved device memory on this platform should be reserved in this VM's pfn space. =over 4 =item B<reserve="STRING"> ...Yes, something like this.[...]index 9af0e99..d7434d6 100644 --- a/docs/misc/vtd.txt +++ b/docs/misc/vtd.txt @@ -111,6 +111,40 @@ in the config file: To override for a specific device: pci = [ '01:00.0,msitranslate=0', '03:00.0' ] +RDM, 'reserved device memory', for PCI Device Passthrough +--------------------------------------------------------- + +There are some devices the BIOS controls, for e.g. USB devices to perform +PS2 emulation. The regions of memory used for these devices are marked +reserved in the e820 map. When we turn on DMA translation, DMA to those +regions will fail. Hence BIOS uses RMRR to specify these regions along with +devices that need to access these regions. OS is expected to setup +identity mappings for these regions for these devices to access these regions. + +While creating a VM we should reserve them in advance, and avoid any conflicts. +So we introduce user configurable parameters to specify RDM resource and +according policies, + +To enable this globally, add "rdm" in the config file: + + rdm = [ 'host, reserve=force/try' ] +The "force/try" should be called "policy". And then you explain what policies we have.Do you mean we should rename this? rdm = [ 'host, policy=force/try' ]No, I didn't ask you to rename that. The above line is an example which should reflect the correct syntax. "force/try" is not the *actual syntax*, i.e. you won't write that in your config file. I meant to changes it to "reserve=POLICY". Then you explain the possible values of POLICY.Understood so what about this, To enable this globally, add "rdm" in the config file: rdm = [ 'host, reserve=<POLICY>' ]OK, so this is a specific example in vtd.txt. Last time I misread it as part of the manpage. I think you meant in this specific example (with other suggestions incorporated): rdm = "type=host, reserve=force" Then you point user to xl.cfg manpage.Or just for a specific device: pci = [ '01:00.0,rdm_reserve=<POLICY>', '03:00.0' ]Same here. In order to be compatible with vtd.txt, could this work for you? To enable this globally, add "rdm" in the config file:rdm = [ 'type=host, reserve=<policy>' ] (default policy is "relaxed") Or just for a specific device:pci = [ '01:00.0,rdm_reserve=<policy>' ] (default policy is "strict") Just don't write "force/try" or "strcit/relax" because that's not the exact syntax you would use in a real config file. Yeah. Global RDM parameter allows user to specify reserved regions explicitly. Using "host" to include all reserved regions reported on this platform which is good to handle hotplug scenario. In the future this parameter may be further extended to allow specifying random regions, e.g. even those belonging to another platform as a preparation for live migration with passthrough devices. Currently "POLICY" includes two options, "strict" and "relaxed". It decides how to handle conflict when reserving RDM regions in pfn space. If conflict ...This is really a policy but 'reserve' may can reflect our action explicitly, right?+Or just for a specific device: + + pci = [ '01:00.0,rdm_reserve=force/try', '03:00.0' ]And you also can see this. But anyway, if you're really stick to rename this, I'm going to be fine as well but we should ping every one to check this point since this name is>from our previous discussion.+ +Global RDM parameter allows user to specify reserved regions explicitly. +Using 'host' to include all reserved regions reported on this platform +which is good to handle hotplug scenario. In the future this parameter +may be further extended to allow specifying random regions, e.g. even +those belonging to another platform as a preparation for live migration +with passthrough devices. + +'force/try' policy decides how to handle conflict when reserving RDM +regions in pfn space. If conflict exists, 'force' means an immediate error +so VM will be killed, while 'try' allows moving forward with a warningBe killed by whom? I think it's hvmloader crashes voluntarily, right?s/VM will be kille/hvmloader crashes voluntarily+message thrown out. + Caveat on Conventional PCI Device Passthrough --------------------------------------------- diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 98687bd..9ed40d4 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -1407,6 +1407,11 @@ static void domcreate_attach_pci(libxl__egc *egc, libxl__multidev *multidev, } for (i = 0; i < d_config->num_pcidevs; i++) { + /* + * If the rdm global policy is 'force' we should override each device. + */ + if (d_config->b_info.rdm.reserve == LIBXL_RDM_RESERVE_FLAG_FORCE) + d_config->pcidevs[i].rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE; ret = libxl__device_pci_add(gc, domid, &d_config->pcidevs[i], 1); if (ret < 0) { LIBXL__LOG(ctx, LIBXL__LOG_ERROR, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 47af340..5786455 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [ (2, "PV"), ], init_val = "LIBXL_DOMAIN_TYPE_INVALID") +libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [ + (0, "none"), + (1, "host"), + ]) + +libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [ + (-1, "invalid"), + (0, "force"), + (1, "try"), + ])If you don't set init_val, the default value would be "force" (0), is thisYes.want you want?We have a little bit of complexity here, "Default per-device RDM policy is 'force', while default global RDM policy is 'try'. When both policies are specified on a given region, 'force' is always preferred."This is going to be done in actual code anyway. This type is used both in global and per-device setting, so I envisageYes.this to have an invalid value to start with. Appropriate default valuesSounds I should set this, +libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [ + (-1, "invalid"), + (0, "strict"), + (1, "relaxed"), + ], init_val = "LIBXL_RDM_RESERVE_FLAG_INVALID") +Yes, and then don't forget to set the value to appropriate value in the _setdefault functions for different types. Currently "type" is not associated to "policy" so we can do this if necessary in the future. should be done in libxl_TYPE_setdefault functions. And the logic to detect conflict and preferences done in your construct function. What do you think?+[...]pcidev->permissive = atoi(tok); }else if ( !strcmp(optkey, "seize") ) { pcidev->seize = atoi(tok); + }else if ( !strcmp(optkey, "rdm_reserve") ) { + if ( !strcmp(tok, "force") ) { + pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE; + } else if ( !strcmp(tok, "try") ) { + pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_TRY; + } else { + pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE; + XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag value:" + " %s, so goes 'force' by default.",If this is not an error, you don't need XLU__PCI_ERR. But I would say we should just treat this as an error and abort/exit/report (whatever the parser should do in this case).In our case we just want to post a message to set a appropriate flag to recover this behavior like we write here, XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag value:" " %s, so goes 'strict' by default.", tok);I suggest we just abort in this case and not second guess what the admin wants.Okay, } else { XLU__PCI_ERR(cfg, "%s is not an valid PCI RDM property" " flag: 'strict' or 'relaxed'.", tok); abort();No, not calling the "abort" function. I meant returning appropriate error value and let the caller handles this situation. Okay, just call "goto parse_error". This may just be a warning? But I don't we have this sort of definition, XLU__PCI_WARN, ... So what LOG format can be adopted here?Feel free to introduce XLU__PCI_WARN if it turns out to be necessary.If it goes to abort(), I think XLU__PCI_ERR() should be good.+ tok); + } }else{ XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s", optkey); } @@ -167,6 +180,71 @@ parse_error: return ERROR_INVAL; } +int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char *str) +{ + unsigned state = STATE_TYPE; + char *buf2, *tok, *ptr, *end; + + if ( NULL == (buf2 = ptr = strdup(str)) ) + return ERROR_NOMEM; + + for(tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) { + switch(state) {Coding style. I haven't checked what actual style this file uses, but there is inconsistency in this function by itself.I just refer to xlu_pci_parse_bdf() to generate xlu_rdm_parse(), and they are in the same file... Anyway, I should change this line, for ( tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++ ) {for (tok = ptr, end...) switch (state) { But what is the difference to compare the initial code? >>>>>> + if ( NULL == (buf2 = ptr = strdup(str)) ) >>>>>> + return ERROR_NOMEM; >>>>>> + >>>>>> + for(tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) { >>>>>> + switch(state) { I thought initially you let me to follow that previous "if" :) + case STATE_TYPE: + if ( *ptr == '\0' || *ptr == ',' ) { + state = STATE_CHECK_FLAG; + *ptr = '\0';[...]+ /* + * By default our global policy is to query all rdm entries, and + * force reserve them. + */ + b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST; + b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_TRY;This should probably to into the _setdefault function of libxl_domain_build_info.Sorry, I just see this libxl_domain_build_info_init() | + libxl_rdm_reserve_init(&p->rdm); | + memset(p, '\0', sizeof(*p)); But this should be generated automatically, right? So how to implement your idea? Could you give me a show?Check libxl_domain_build_info_setdefault. To use libxl types. You normally do: libxl_TYPE_init libxl_TYPE_setdefault DO STUFF libxl_TYPE_dispose _init and _dispose are auto-generated. _setdefault is not.So in our case, maybe we can do this, diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index f0da7dc..461606c 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -100,6 +100,17 @@ static int sched_params_valid(libxl__gc *gc, return 1; } +void libxl__device_rdm_setdefault(libxl__gc *gc, + libxl_domain_build_info *b_info)It's not a device. Use libxl__rdm_setdefault. Okay. +{ + /* + * By default our global policy is to query all rdm entries, and + * force reserve them. + */ + b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST; + b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_STRICT; +} +Isn't the global policy "relaxed" (or "try")? At least that's what your old code does. BTW your original code contradicts your original comment. Right, sorry for this typo. int libxl__domain_build_info_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info) { @@ -410,6 +421,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, libxl_domain_type_to_string(b_info->type)); return ERROR_INVAL; } + + libxl__device_rdm_setdefault(gc, b_info); return 0; }And you also need to modify libxl__device_pci_setdefault. Okay. I actually have another question on the interface design. To recap, in your patch: diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 47af340..5786455 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [ (2, "PV"), ], init_val = "LIBXL_DOMAIN_TYPE_INVALID") +libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [ + (0, "none"), + (1, "host"), + ]) + +libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [ + (-1, "invalid"), + (0, "force"), + (1, "try"), + ]) + libxl_channel_connection = Enumeration("channel_connection", [ (0, "UNKNOWN"), (1, "PTY"), @@ -356,6 +367,11 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), ]) +libxl_rdm_reserve = Struct("rdm_reserve", [ + ("type", libxl_rdm_reserve_type), + ("reserve", libxl_rdm_reserve_flag), + ]) + libxl_domain_build_info = Struct("domain_build_info",[ ("max_vcpus", integer), ("avail_vcpus", libxl_bitmap), @@ -401,6 +417,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("kernel", string), ("cmdline", string), ("ramdisk", string), + ("rdm", libxl_rdm_reserve), ("u", KeyedUnion(None, libxl_domain_type, "type", [("hvm", Struct(None, [("firmware", string), ("bios", libxl_bios_type), @@ -521,6 +538,7 @@ libxl_device_pci = Struct("device_pci", [ ("power_mgmt", bool), ("permissive", bool), ("seize", bool), + ("rdm_reserve", libxl_rdm_reserve_flag), ]) Do you actually need libxl_rdm_reserve type? I.e. do you envisage that structure to change a lot? Can you not just use libxl_rdm_reserve_type and libxl_rdm_reserve_flag in build_info. We'd like to introduce this type, libxl_rdm_reserve, to combine "type" and "flag". From my point of view, this sole structure can represent a holistic approach to rdm because, #1. Obviously its easy to get all;#2. It will probably be extended since like this name, rdm, reserved device memory, this should not be restricted to RMRR currently. So I just feel its flexible to support others in the future, or much more properties :) Thanks Tiejun _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |