
Re: [Xen-devel] [PATCH 2/2] xenbus: bypass xenbus frontend resume if xenstored is not running

On 02/05/13 10:24, Ian Campbell wrote:
> On Thu, 2013-05-02 at 10:21 +0100, Jan Beulich wrote:
>>>>> On 02.05.13 at 10:24, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
>>> On Wed, 2013-05-01 at 13:57 +0100, Aurelien Chartier wrote:
>>>> If the xenbus frontend is running in a domain running xenstored or in dom0,
>>>> the device resume is hanging because it is happening before the process
>>>> resume. This patch adds extra logic to the resume code to check if we are
>>>> the domain running xenstored or dom0.
>>>> The frontend will be reconnected later, when the backend resumes from S3.
>>>> This logic is working when xenstored is running in dom0, but has not been
>>>> tested with a xenstore stub domain.
>>>> ---
>>>>  drivers/xen/xenbus/xenbus_probe_frontend.c |   15 ++++++++++++++-
>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c b/drivers/xen/xenbus/xenbus_probe_frontend.c
>>>> index 3159a37..8583afe 100644
>>>> --- a/drivers/xen/xenbus/xenbus_probe_frontend.c
>>>> +++ b/drivers/xen/xenbus/xenbus_probe_frontend.c
>>>> @@ -89,9 +89,22 @@ static void backend_changed(struct xenbus_watch *watch,
>>>>    xenbus_otherend_changed(watch, vec, len, 1);
>>>>  }
>>>> +static int xenbus_frontend_dev_resume(struct device *dev)
>>>> +{
>>>> +  /* 
>>>> +   * If xenstored is running in that domain, we cannot access the backend
>>>> +   * state at the moment. If we are running in dom0, the domain running
>>>> +   * xenstored is still suspended at that point
>>>> +   */
>>>> +  if (xen_initial_domain() || (xen_store_domain == XS_LOCAL))
>>>> +          return 0;
>>>> +
>>>> +  return xenbus_dev_resume(dev);
>>> When or where does this eventually get called for the init domain or
>>> XS_LOCAL cases?
>> I was about to ask the same question. Plus I don't think the
>> description here or in the overview mail really makes clear how
>> specifically a deadlock would occur here. That's pretty relevant to
>> understand in the light that so far we had no indication of there
>> being any special treatment necessary here, and resume from S3
>> had been working quite fine without that (at least as long as
>> xenstored is running in Dom0 and at least with the traditional/
>> forward-port/non-pvops kernels).
> I think the unusual feature here is that dom0 has a netfront attached.
> Netfront resume is therefore hanging because it is trying to talk to the
> still frozen xenstored process in dom0.
> Ian.
Yes, the unusual feature of having a netfront driver in dom0 is
triggering the S3 issue I described. Ian made me realize this issue
could also happen in Xenstore stub domains.

The root cause of the issue is the assumption that a xenstored process
is running in another domain when the xenbus frontend is being resumed
from S3. This assumption is incorrect if xenstored and the xenbus
frontend are running in the same domain. Because the Linux kernel waits
for all devices to be resumed before resuming userland tasks, the xenbus
frontend resume blocks the userland process resume while it waits for
xenstored, which cannot run because it is itself a userland process.
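
As a rough illustration of the ordering problem described above, here is
a small user-space model in C. It is only a sketch under simplifying
assumptions: resume_would_block and its parameters are hypothetical
names invented for this example, not kernel functions, and it covers
only the case where xenstored is a userland process in the same domain.

```c
#include <stdbool.h>

/*
 * User-space model of the S3 resume ordering (hypothetical helper, not
 * kernel code). The kernel resumes devices first and thaws userland
 * tasks afterwards, so a xenstored running in this domain is still
 * frozen while device resume callbacks run.
 */
static bool resume_would_block(bool xenstored_is_local,
                               bool frontend_queries_xenstore)
{
        /* A local userland xenstored has not been thawed yet. */
        bool xenstored_frozen = xenstored_is_local;

        /* Waiting for a reply from a frozen daemon never completes. */
        return frontend_queries_xenstore && xenstored_frozen;
}
```

In this model, a frontend that queries xenstore during device resume
blocks only when xenstored is local, which matches the dom0 netfront
case discussed in this thread.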

With this patch, the xenbus_dev_resume function will not be called at
all for frontend devices such as netfront in dom0 or in the XS_LOCAL
case. I am relying on the fact that the network backend domain will be
resumed after dom0 resume is complete. When that resume happens, it will
trigger a call to netback_changed in the dom0 netfront. This call will
end up resuming the xenbus states in netfront.
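
The intended flow can be sketched as a user-space model in C. This is
an illustration only, not the patch itself: the model_* names and
struct model_frontend are hypothetical, the enum values other than
XS_LOCAL are assumed, and the real netback_changed path goes through
the full xenbus state machine rather than setting a flag.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's xenstore_init values; only XS_LOCAL matters here. */
enum xenstore_init { XS_UNKNOWN, XS_PV, XS_HVM, XS_LOCAL };

/* Hypothetical stand-in for a frontend device such as netfront. */
struct model_frontend {
        bool connected;
};

/* Stand-in for xenbus_dev_resume: reconnects by talking to xenstored. */
static int model_dev_resume(struct model_frontend *fe)
{
        fe->connected = true;
        return 0;
}

/* Models the check added by the patch in xenbus_frontend_dev_resume. */
static int model_frontend_dev_resume(struct model_frontend *fe,
                                     bool initial_domain,
                                     enum xenstore_init store_domain)
{
        /* xenstored is not reachable yet: skip and leave fe disconnected. */
        if (initial_domain || store_domain == XS_LOCAL)
                return 0;

        return model_dev_resume(fe);
}

/* Models the later backend state change that reconnects the frontend. */
static void model_backend_changed(struct model_frontend *fe)
{
        fe->connected = true;
}
```

In the dom0 case the frontend stays disconnected after
model_frontend_dev_resume returns and only becomes connected once
model_backend_changed runs, which is the behaviour the patch relies on.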

That logic works for a dom0 netfront, as we can safely rely on the
fact that the network backend domain will be resumed after dom0 resume
is complete. I don't have a Xen configuration with a Xenstore stub
domain, but it would probably need some extra logic to reconnect the
frontend after xenstored has been resumed. The main goal of this patch
is to fix the S3 resume of domains running both a xenbus frontend and
xenstored.


Xen-devel mailing list


