[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH RFC 2/8] public/io/netif: add directory for backend parameters
On 02/08/2018 11:13 AM, Wei Liu wrote: > On Wed, Feb 07, 2018 at 12:10:37PM +0000, Joao Martins wrote: >> On 02/06/2018 05:12 PM, Wei Liu wrote: >>> (Three months after you sent this, sorry...) >>> >>> On Mon, Nov 06, 2017 at 12:33:06PM +0000, Joao Martins wrote: >>>> On Mon, Nov 06, 2017 at 10:33:59AM +0000, Paul Durrant wrote: >>>>>> -----Original Message----- >>>>>> From: Joao Martins [mailto:joao.m.martins@xxxxxxxxxx] >>>>>> Sent: 02 November 2017 18:06 >>>>>> To: Xen Development List <xen-devel@xxxxxxxxxxxxx> >>>>>> Cc: Joao Martins <joao.m.martins@xxxxxxxxxx>; Konrad Rzeszutek Wilk >>>>>> <konrad.wilk@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>; Wei Liu >>>>>> <wei.liu2@xxxxxxxxxx> >>>>>> Subject: [PATCH RFC 2/8] public/io/netif: add directory for backend >>>>>> parameters >>>>>> >>>>>> The proposed directory provides a mechanism for tools to control the >>>>>> maximum feature set of the device being provisioned by backend. >>>>>> The parameters/features include offloading features, number of >>>>>> queues etc. >>>>>> >>>>>> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx> >>>>>> --- >>>>>> xen/include/public/io/netif.h | 16 ++++++++++++++++ >>>>>> 1 file changed, 16 insertions(+) >>>>>> >>>>>> diff --git a/xen/include/public/io/netif.h >>>>>> b/xen/include/public/io/netif.h >>>>>> index 2454448baa..a412e4771d 100644 >>>>>> --- a/xen/include/public/io/netif.h >>>>>> +++ b/xen/include/public/io/netif.h >>>>>> @@ -161,6 +161,22 @@ >>>>>> */ >>>>>> >>>>>> /* >>>>>> + * The directory "require" maybe be created in backend path by tools >>>>>> + * domain to override the maximum feature set that backend provides to >>>>>> the >>>>>> + * frontend. The children entries within this directory are features >>>>>> names >>>>>> + * and the correspondent values that should be used backend as defaults >>>>>> e.g.: >>>>>> + * >>>>>> + * /local/domain/X/backend/<domid>/<handle>/require >>>>>> + * /local/domain/X/backend/<domid>/<handle>/require/multi-queue- >>>>>> max-queues = "2" >>>>>> + * /local/domain/X/backend/<domid>/<handle>/require/feature-no-csum- >>>>>> offload = "1" >>>>>> + * >>>>>> + * In the example above, network backend will negotiate up to a maximum >>>>>> of >>>>>> + * two queues with frontend plus disabling IPv4 checksum offloading. >>>>>> + * >>>>>> + * This directory and its children entries shall only be visible to the >>>>>> backend. >>>>>> + */ >>>>>> + >>>>> >>>>> What should happen if the toolstack sets something in 'require' that >>>>> the backend cannot provide? I don't see anything in your RFC patches >>>>> to check that the backend has responded appropriately to the keys. >>>> >>>> Hmm, you're right that this RFC doesn't handle that properly - but for the >>>> ones the backend provide I had suggested (albeit not implemented here) >>>> back in the other thread that we could compare the values of feature in >>>> "require" with the one announced to the frontend. But well this wouldn't >>>> cover the non-provided ones, and possibly would fall a bit as a hack. >>>> >>>> I could change the format of the entries within "require" >>>> directory to be e.g. "<id>-<feature-name> = <feature-value>" and the >>>> acknowledgement entry would come in the form "<id>-status >>>> = <error code>". Consequently the lack of a "<id>-status" entry would >>>> have a stronger semantic i.e. unsupported and ignored. The toolstack then >>>> would have >>>> means to check whether the feature was really succesfull set as desired >>>> or not. But then one question comes to mind: should the backend be >>>> prevented to init in the event that the features requested fail to be >>>> set? In which case uevent (on Linux) isn't triggered and xenbus state >>>> doesn't >>>> get changed and toolstack would fail with timeout later on. >>>> >>> >>> I think the backend should not proceed if it can't meet the >>> requirements. But to be clear I also don't think the timeout behaviour >>> should be used to determine if the setting is successful because it is >>> asking other part of the system to pick up the slack and system >>> administrators would be left in the dark (unless there is easily >>> accessible message that can be obtained by libxl to return to system >>> administrators). >> >> That timeout behaviour is already there *I think* (or maybe I have the wrong >> impression)? The alternative is to trigger the uevent and add more logic on >> the > > Yes, it is already there. Libxl will wait for the backend to change its > state for X seconds. > > The difference now is the system administrators can potentially easily > trigger a timeout due to misconfiguration, while previously it is mostly > due to the module not getting loaded or some other failures that are not > the system administrators' fault. > /nods >> hotplug script to check if the parameters were set according to config, but >> OTOH >> you add more complexity there. Or perhaps we can check that the backend set >> to >> its state to Unknown (or some other state) and that determines the failure - >> but >> still no uevent should be triggered. Unless you had something else in your >> mind? > > On the other hand, I don't think adding uevent or whatever other logic > is a good idea and it is a bit risky to rely on the state of driver to > determine failure because we don't have a state machine that applies to > all drivers. Agreed. > We can probably specify a xenstore node in the spec to > return some error code and let libxl read it. With that model old tools > work the same (extra node ignored) but new tools can utilise the new > node. IIRC there could already be some node that can be utilised -- > xenbus_dev_fatal writes message to xenstore, I think. > I almost forgot about xenbus_dev_fatal(). It writes to an "error" entry in the backend|frontend path with the errno plus error message. But it also changes the device xenbus state to XenbusClosed. Taking into consideration your earlier comment you probably meant xenbus_dev_error() instead? netback does allow Initialising state to be directly into Closing, but others might not be the same. > What do you think? I like the idea of having a similar "error" entry in the config|require directory following the same pattern as mentioned in the last paragraph. Something like: <backend|frontend path>/config/error = "<errno> <message>" I would imagine this could be wrapper in a xenbus_config_fatal(). I had suggested a slightly more complicated version of it in a old reply to Paul (at top of this message) with: <backend|frontend path>/require/<id>-<feature-name> = "<feature-value>" <backend|frontend path>/require/<id>-status = "<error code>" But taking morphing it with your comment could also be something like: <backend|frontend path>/config/<feature-name> = "<feature-value>" <backend|frontend path>/config/<feature-name>/error = "<errno> <message>" But either this option or the config/error global one depend on whether people here prefer to deliver all configuration errors at once, or one global "config/error" which would give the first entry with an error. The latter though could lead to the sysadmin having to recreate the domain multiple times to see/handle all the errors. Joao P.S. s/require/config _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |