[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/3] libxl: make creation of xenstore suspend event channel node optional



> -----Original Message-----
> From: Julien Grall <julien@xxxxxxx>
> Sent: 28 February 2020 10:26
> To: Durrant, Paul <pdurrant@xxxxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Anthony PERARD <anthony.perard@xxxxxxxxxx>; Ian Jackson
> <ian.jackson@xxxxxxxxxxxxx>; Wei Liu <wl@xxxxxxx>
> Subject: Re: [Xen-devel] [PATCH 2/3] libxl: make creation of xenstore
> suspend event channel node optional
> 
> Hi Paul,
> 
> On 28/02/2020 09:28, Durrant, Paul wrote:
> >> -----Original Message-----
> >> From: Julien Grall <julien@xxxxxxx>
> >> Sent: 27 February 2020 22:52
> >> To: Durrant, Paul <pdurrant@xxxxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx
> >> Cc: Anthony PERARD <anthony.perard@xxxxxxxxxx>; Ian Jackson
> >> <ian.jackson@xxxxxxxxxxxxx>; Wei Liu <wl@xxxxxxx>
> >> Subject: Re: [Xen-devel] [PATCH 2/3] libxl: make creation of xenstore
> >> suspend event channel node optional
> >>
> >> Hi,
> >>
> >> On 26/02/2020 16:08, Paul Durrant wrote:
> >>> The purpose and semantics of the node are explained in
> >>> xenstore-paths.pandoc [1]. It was originally introduced in xend by
> >> commit
> >>> 17636f47a474 "Teach xc_save to use event-channel-based domain suspend
> if
> >>> available.". Note that, because, the top-level frontend 'device' node
> >> was
> >>> created writable by the guest in xend, there was no need to explicitly
> >>> create the 'suspend-event-channel' node as writable node.
> >>>
> >>> However, libxl creates the 'device' node as read-only by the guest and
> >> so
> >>> explicit creation of the 'suspend-event-channel' node is necessary to
> >> make
> >>> it usable. This unfortunately has the side-effect of making some old
> >>> Windows PV drivers [2] cease to function. This is because they scan
> the
> >> top
> >>> level 'device' node, find the 'suspend' node and expect it to contain
> >> the
> >>> usual sub-nodes describing a PV frontend. When this is found not to be
> >> the
> >>> case, enumeration ceases and (because the 'suspend' node is observed
> >> before
> >>> the 'vbd' node) no system disk is enumerated. Windows will then crash
> >> with
> >>> bugcheck code 0x7B.
> >>>
> >>> This patch adds a boolean 'suspend_event_channel' field into
> >>> libxl_create_info to control whether the xenstore node is created and
> a
> >>> similarly named option in xl.cfg which, for compatibility with
> previous
> >>> libxl behaviour, defaults to true.
> >>>
> >>> [1]
> >> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/xenstore-
> >> paths.pandoc;hb=HEAD#l177
> >>> [2] https://access.redhat.com/documentation/en-
> >> us/red_hat_enterprise_linux/5/html/para-
> >> virtualized_windows_drivers_guide/sect-para-
> >> virtualized_windows_drivers_guide-
> >> installing_and_configuring_the_para_virtualized_drivers-
> >> installing_the_para_virtualized_drivers
> >>>
> >>> NOTE: While adding the new LIBXL_HAVE_CREATEINFO_SUSPEND_EVENT_CHANNEL
> >>>         definition into libxl.h, this patch corrects the previous
> stanza
> >>>         which erroneously implies ibxl_domain_create_info is a
> function.
> >>>
> >>> Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
> >>> ---
> >>> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> >>> Cc: Wei Liu <wl@xxxxxxx>
> >>> Cc: Anthony PERARD <anthony.perard@xxxxxxxxxx>
> >>> ---
> >>>    docs/man/xl.cfg.5.pod.in    |  7 +++++++
> >>>    tools/libxl/libxl.h         | 13 ++++++++++++-
> >>>    tools/libxl/libxl_create.c  | 12 +++++++++---
> >>>    tools/libxl/libxl_types.idl |  1 +
> >>>    tools/xl/xl_parse.c         |  3 +++
> >>
> >> You may want to update xenstore-paths.pandoc as the document mention
> the
> >> node will be created by the toolstack.
> >>
> >
> > The doc already says that:
> >
> > -----
> > #### ~/device/suspend/event-channel = ""|EVTCHN [w]
> >
> > The domain's suspend event channel. The toolstack will create this
> > path with an empty value which the guest may choose to overwrite.
> > -----
> 
> The paragraph you quoted does not suggest the node may not exist. So I
> think you want to update the documentation to reflect the node may not
> exist.
> 
> >>>    5 files changed, 32 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> >>> index 0cad561375..5f476f1e1d 100644
> >>> --- a/docs/man/xl.cfg.5.pod.in
> >>> +++ b/docs/man/xl.cfg.5.pod.in
> >>> @@ -668,6 +668,13 @@ file.
> >>>
> >>>    =back
> >>>
> >>> +=item B<suspend_event_channel=BOOLEAN>
> >>> +
> >>> +Create the xenstore path for the domain's suspend event channel. The
> >>> +existence of this path can cause problems with older PV drivers
> running
> >>> +in the guest. If this option is not specified then it will default to
> >>> +B<true>.
> >>
> >> In the next patch you are going to make device/ rw. Do you see any
> issue
> >> with just not creating the node for everyone? Are PV drivers allowed to
> >> assume a node will be present?
> >
> > Well that's the problem. Some of them may have become accustomed to it
> being present. Also the documentation does say it is created by the
> toolstack (but not when). Perhaps I should update the doc to say the
> toolstack *may* create this path when the domain is created.
> 
> Hmmm fair point. However, this now means you may need to modify your
> configuration file depending on the PV driver installed.
> 
> This is not a very ideal situation to be in when you have to upgrade
> your OS (imagine the new PV driver requires the path).
> 
> How do you envision this to work?
> 

Indeed. Basically this is a control knob you can try if you have drivers that 
don't cope. The scenario is that you have a guest image imported from an old 
(pre-libxl) Xen, you fire it up, it BSODs... what do you do? The bug is in the 
guest drivers - they ought to cope whether the node is there or not - but you 
can't boot the VM to upgrade them. This option allows you to get the VM going 
so you can fix it. Your only other option would be to start the guest paused, 
xenstore-rm the node, and then unpause. 
We could just document that procedure, but where? Hopefully an admin might read 
the xl.cfg manpage when moving to the newer Xen and would see this option was 
available to them.

The fact that xenstore is largely a wild west is not an admin's fault and we 
really ought to try to help them where we can.

  Paul
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.